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CHAPTER  I 


INTRODUCTION 

In  recent  years,  the  use  of  stochastic  systems  becomes  more  extensive  in 
the  study  of  complex  phenomena.  As  the  complexity  grows,  it  is  more  and 
more  difficult  in  obtaining  analytic  results.  Therefore,  the  use  of  computer 
simulation  to  study  complex  stochastic  systems  has  been  widely  adopted. 
Typically,  we  first  construct  a  model  which  captures  the  underlying  structure 
of  the  stochastic  system.  We  then  perform  sampling  experiments  on  the  model 
and  analyze  the  simulation  output  sequences  to  make  statistical  inferences 
about  the  behavior  of  the  system.  Since  results  are  based  on  the  observations 
from  our  experiments,  it  is  important  to  develop  theoretically  sound  and 
computationally  efficient  methods  for  simulation  output  analysis.  This  is  our 
main  concern  here. 

In  general,  we  want  to  estimate  parameters  associated  with  the  steady- 
state  distribution  of  a  stable  stochastic  process.  We  use  confidence  inter¬ 
vals  for  the  quantities  of  interest  to  assess  the  statistical  precision  of  our 
point  estimates.  To  construct  confidence  interval  for  the  characteristic  of 
the  system  under  study  requires  the  knowledge  of  the  variance  of  the  es¬ 
timate.  Different  methods  have  been  developed  to  evaluate  this  quantity. 
The  methods  currently  being  used  art  the  regenerative  method,  independent 
replications,  batch  means,  and  an  autoregressive  approach.  Except  for  the 
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replication  method,  all  methods  are  based  on  one  output  sequence  recorded 
from  a  single  simulation  run. 

The  regenerative  method  is  based  on  limit  theorems  developed  for  regenera 
tive  stochastic  processes;  see  Crane  and  Lemoine  (1977)  and  Iglehart  (1978) 
for  an  introduction  to  and  a  detailed  review  of  the  regenerative  method. 
To  use  it,  one  has  to  explore  the_  existence  of  a  regenerative  structure  and 
define  an  appropriate  state  vector  to  carry  out  the  simulation.  In  practice, 
simulations  arise  in  which  simulators  may  be  reluctant  to  devote  the  time 
required  to  do  so  or  the  regenerative  property  may  be  absent.  In  these  cases, 
alternative  approaches  for  simulation  output  analysis  play  an  important  role. 
The  autoregressive  method  is  developed  to  cope  with  these  limitations  of  the 
regenerative  method. 

The  autoregressive  approach  has  been  discussed  by  Fishman  (1973,  1978). 
In  this  paper  we  discuss  an  autoregressive  method  which  is  a  refinement  of 
the  old  one  and  is  generalized  to  multidimensional  processes.  In  contrast 
to  the  regenerative  method,  the  autoregressive  method,  which  is  based  on 
theorems  developed  for  stationary  processes,  is  a  method  of  approximation. 
It  relies  on  the  assumption  that  the  variance  constant  required  for  assessing 
the  precision  of  point  estimates  can  be  approximated  arbitrarily  closely  by 

the  spectral  density  function  at  zero  of  a  finite  order  autoregressive  process. 

/ 

/ 

Instead  of  estimating  the  variance  directly  we  use  techniques  developed  for 
time  series  analysis  to  get  an  approximation.  This  is  our  first  goal. 

Although  simulation  is  useful,  it  can  be  a  very  expensive  tool  to  use. 
Since  considerable  computer  time  is  required  for  simulation  runs,  it  is  there¬ 
fore  important  to  obtain  as  precise  results  as  possible  from  the  simulation. 


The  second  goal  of  this  paper  is  to  develop  several  variance  reduction  tech¬ 
niques  which  can  be  used  in  conjuction  with  the  autoregressive  method  for 
obtaining  additional  variance  reduction  for  the  estimate.  Our  approach  to 
this  objective  is  to  introduce  some  auxiliary  processes,  which  are  correlated 
with  the  original  process  under  study;  along  with  the  original  process  and 
apply  the  multidimensional  version  of  the  autoregressive  method. 

A  natural  starting  point  for  achieving  our  objectives  is  a  review  of  the 
theory  of  stationary  processes.  Chapter  2  provides  a  review  of  the  general 
theory,  applicable  limit  theorems,  and  some  linear  models  of  stationary  processes 
We  also  develop  some  approximation  theorems  for  continuous  spectral  density 
function.  It  is  then  demonstrated  that  a  continuous  spectral  density  function 
can  be  approximated  arbitrarily  closely  by  the  corresponding  spectral  density 
functions  of  some  finite  order  linear  models. 

Chapter  3  provides  the  estimation  method  for  strictly  stationary  processes. 
This  chapter  contains  a  discussion  of  order  selection  criteria  along  with  the 
establishment  of  the  .consistency  of  estimates.  In  the  last  section,  we  estab¬ 
lish  some  conditions  which  allow  us  to  apply  the  autoregressive  method  to 
simulation  output  data  and  justify  the  autoregressive  method  for  certain 
non-stationary  processes.  Hence,  the  method  is  applicable  to  the  simulation 
of  discrete  or  continuous  time  Markov  processes  and  semi-Markov  processes. 

Several  variance  reduction  techniques,  which  enable  us  to  shorten  the . 
confidence  interval  constructed,  are  developed  in  Chapter  4.  In  the  applica¬ 
tion  of  these  techniques  a  new  point  estimate  is  formed  by  taking  a  linear 
combination  of  the  old  point  estimate  and  the  point  estimates  obtained  from 
those  /auxiliary  processes.  The  coefficients  of  this  linear  combination  which 


minimize  the  variance  of  the  new  point  estimate  must  be  estimated. 

To  see  how  the  method  works,  Chapter  5  contains  several  numerical 
examples.  They  are  (1)  the  waiting  time  process  in  an  M/M/1  queue,  (2) 
the  outflow  process  of  a  lake  model,  (3)  the  passage  time  and  response  time 
processes  in  a  closed  network  of  queues,  (4)  the  queue  length  process  in  a 
two-station  single  server  cyclic  queue.  Except  for  the  process  in  Example  4, 
which  is  a  semi-Markov  process,  all  processes  are  Markov  processes.  All  these 
processes  are  regenerative  processes;  hence  we  include  the  simulation  results 
obtained  by  using  the  regenerative  method  for  comparison.  In  each  example, 
we  are  able  to  provide  the  theoretical  values. 

In  Chapter  6,  we  examine  the  strengths  and  weaknesses  of  the  autoregres¬ 
sive  method. 
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CHAPTER  H 


STATIONARY  PROCESSES  WITH  A 
DISCRETE  TIME  PARAMETER 


2.1.  General  Definitions  and  Theorems 

Let  E  be  some  measurable  space,  £  a  er-field  of  subsets  of  E  with  a 
probability  measure  P.  A  random  variable  is  a  measurable  function  from  E 
to  the  real  line  5R.  A  stochastic  process  y  with  discrete  time  parameter  is  a 
family  of  random  variables  {l/(n)  :  n  £  T  }.  Here,  y(n)  is  the  observation 
at  time  n  and  T  is  the  time  range  involved,  where  T  =  { 0, 1, 2, . . . }  or 
T  =  { . . . ,  — 1, 0, 1, ...}.  And  by  a  d-dimensional  random  process  y  we  will 
mean  a  column  vector  consisting  of  d  random  processes  {yy(n)  :  n  £  T}, 

j  ”  1, 2,  • . . ,  d, 

v(n)  =  (yi  (n),  y2{n), . . . ,  yd(n))', 
where  '  denotes  the  transpose  of  a  vector. 

(2.1.1)  DEFINITION,  (a)  A  stochastic  process  y  =  (y(n)  :  n  £  T}  is 
said  to  be  strictly  stationary  if  the  joint  distribution  of  y(ni  +  n),  y(n2  +  n), 

•  •  •  >y{nk+n)  is  independent  of  n  for  every  finite  set  of  integers  {ni,ri2, . . .  ,njt } 
of  T  and  for  every  integer  n  such  that  {  ni  +  n,  n2  +  n, . . . ,  n*  +  n  }  C  T, 
i.e., 

P{  yy,  (ni  +  n)  £  Slf . . . ,  yjk (nfc  +  n)  £  Bk  }  = 

P{yj\{ni)  €  Bi,...,yjk{nk)  e  Bk}, 
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for  all  Borel  sets  B\,...,Bk  £  £,  and  every  subset  {  ji,  32, . . . ,  jk  }  of  { 1, 2, . . . ,  d }. 
(b)  A  weakly  stationary  process  y  is  a  stochastic  process  having  finite  second 
moments  (E{  y*(n)2  }  <  00),  a  constant  mean  fj.  =  E{  y[n) },  and  its  covariance 
function 

R  (n)  =  E{  (y(n  +  m)  -  fj.)(y(m)  -  /*)*  },  (2.1.2) 

exists,  is  finite,  and  docs  not  depend  on  m.  Here  *  denotes  the  conjugate 
transpose  of  a  matrix  or  Vector. 

Obviously,  any  process  which  is  stationary  in  the  strict  sense  and  has 
finite  covariance  matrix  is  weakly  stationary.  Two  weakly  stationary  processes 
x  =  {x(n)  :  n  £  T}  and  y  =  (y(n)  :  n  £  T}  are  said  to  be  stationarily 
correlated,  if  their  joint  covariance  function 

Rxy{m,  n)  =  E{  (x(m)  -  M*)Wn)  ~  My )*  }, 
exists  and  depends  only  on  the  difference  m  —  n. 

(2.1.3)  EXAMPLE.  White  Noise.  Let  . ..,e(—  1),  e(0),  c(l), ...  be  a 
sequence  of  d-dimensional  random  vectors  with 

E{  e(n) }  =  0, 

E{  c(n)2  }  =  G, 

where  0  is  the  zero  vector  and  G  is  non-negative  definite*,  and  such  that  any  ' 
two  different  vectors  are  uncorrelated,  that  is, 

E{e(n)e*(m) }  =  Oj,  for  nj^m, 
matrix  G  is  non-ncgativc  definite  if  Oi'Goc  >  0  for  any  com  pi  ex- valued  vector  Of. 


where  0<i  denotes  the  d  X  d  zero  matrix.  Then  the  process  e  =  {  e(n) :  — oo  < 

n  <  oo  }  is  weakly  stationary  with  covariance  function 

[G,  k  =  0; 

l0cf> 


otherwise. 


The  time  series  e(n)  is  usually  called  white  noise. 

(2.1.4)  EXAMPLE.  Stationary  Markov  Chains.  Let  {X(n)  :  n  >  0} 
be  a  Markov  chain  for  which  the  initial  state  X(0)  is  chosen  according  to  the 
stationary  distribution.  Then  (X(n)  :  n  >  0}  is  strictly  stationary. 

The  class  of  covariance  functions, -12 ,  defined  by  Equation  (2.1.2)  can  be 
described  by  the  following  theorem. 

(2.1.5)  THEOREM.  The  covariance  function  is  non-negative  definite,  that 


12(-n)  =  J2*(n), 


Y,  <R(m - >0,  N  =  1,2,...  (21'6) 

m,n=»l 

for  every  set  of  complex  vectors  txi, . . . ,  Conversely,  any  function  R 
satisfying  (2.1.6)  is  the  covariance  function  of  a  stationary  process. 

Proof.  See  Doob  (1953)  p.  473.  Q 

(2.1.7)  THEOREM.  If  R  is  the  covariance  function  of  a  weakly  stationary 


process,  then 


R{n)=  [  einXF{d\), 

J -IT 


where  F  is  a  matrix-valued  function  whose  increments,  F{\\)  —  JP(X2),  \i  > 
X2,  are  Hermitian  non-negative^.  The  function  F  is  uniquely  defined  if  we 
require  in  addition  (i)  F(—it)  =  0,  and  (ii)  F(\)  is  right-continuous. 

Proof.  See  Hannan  (1970)  pp.  34-37.  B 

*A  matrix  F  is  Hermitian  non-negative  if  F  —  F*  and  F  is  non-negative  definite. 


The  function  F  is  called  the  spectral  distribution  function  and  we  can 
write  F  in  the  form 


F  =  Fae  +  Fa  +  Fd) 


where  Fae  is  the  absolutely  continuous  part  of  F,  Fs  is  the  singular  part,  and 
Fd  is  the  discrete  part.  If  F  is  absolutely  continuous,  (i.e.,  Ft  =  Fd  =  0d) 
then 


F(\)  = 


du, 


where  f  is  called  the  spectral  density,  function.  The  matrices  /(X)  are  also 
Hermitian  non-negative  for  —7 r  <  X  <  7r;  that  is  /(X)  =  /*(X)  and 

ot*f{\)oi  >  0  for  every  complex-valued  vector  a.  Throughout  this  paper, 
we  shall  only  be  concerned  with  real  processes  with  absolutely  continuous 
spectral  distribution  functions.  Since  we  deal  only  with  real  processes,  {  y(n ) : 
n  >  0},  R{n)  is  real,  and  R{n)  —  R'{—n).  A  simple  calculation 


(R(n)  =  i2'(-n)), 


shows  that  /(—X)  =  /A(X).  If  £^L=-ool^*'j(n)|2  <  oo>  then  the  numbers 
Ri3{n)/2n  are  simply  the  Fourier  coefficients  of  the  Fourier  series  expansion 
of  the  function  /«j(X),  thus 


/(M  =  ^  f)  «(")< 


— inX 


(2.1.8) 


Equation  (2.1.8)  is  the  Fourier  series  representation  of  /(X)  and  thus  / 
is  continuous.  Note  that  Equation  (2.1.8)  also  follows  from  the  condition 
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E~=-ool%(n)!  ^  °°>  since  an  absolutely  summable  series  is  automatically 
square  summable.  The  next  theorem  describes  the  class  of  spectral  distribu¬ 
tion  functions. 

(2.1.9)  THEOREM.  In  order  that  the  matrix  function  F  be  the  spectral 
distribution  function  of  some  weakly  stationary  d-dimensional  process,  it  is 
necessary  and  sufficient  that  the  matrix  F(\ 2)  —  .F(Xi)  be  Hermitian  non¬ 
negative  for  7 r  >  X2  >  Xi  >  — 7r. 

Proof.  See  Rozanov  (1967)  pp.  22-23:  Q 

Before  we  state  the  next  theorem,  we  need  the  following  definitions. 

(2.1.10)  DEFINITION.  A  random  process  {z(t)  :  —00  <  t  <  00}  is  said 
to  have  orthogonal  increments  if  E{  (x(Xi)  —  z(X2))(z:(X3)  —  ;z(X4))*  }  =  0d, 
Xi  >  X2  >  X3  >  X4. 

(2.1.11)  DEFINITION.  Let  x^,  n  =  1,  2, ...  be  a  sequence  of  random 
variables  for  which  E{  x\  }  <  00.  Then  xn  is  said  to  converge  in  mean  square 
to  a  random  variable  x  if 

lim  E{  |xn  —  x|2  }  =  0.  (2.1.12) 

n— *  00 

A  necessary  and  sufficient  condition  that  x  exists  satisfying  (2.1.12)  is 
the  Cauchy  condition, 

.  lim  E{  |xm  -  xn|2  }  =  0. 

m,n-+  00 


We  now  state  the  spectral  representation  theorem  for  stationary  processes. 


(2.1.13)  THEOREM.  Every  weakly  stationary  process  {  y(n) :  — oo  <  n  < 
oo  }  admits  a  spectral  representation 


(2.1.14) 


where  the  process  {  z(X) :  —  n  <  X  <  7r }  has  orthogonal  increments  and 


E{z(d\)z*(d\)}  =  F(d\). 

Defining  ar(X)  to  be  right-continuous,  it  is  then  uniquely  determined  neglecting 
a  set  in  E  of  probability  measure  zero. 


Proof.  See  Hannan  (1970)  p.  41.  Q 

The  symbol  d\,  which  appears  in  integral  (2.1.14),  will  be  thought  of  as 
a  very  small  interval  containing  X.  Also,  d\,  dfi, . . .  will  be  small  intervals 
containing  X,  /z, . . . ,  respectively.  Then 

(F{d\),  if  X=/i; 

E{z{d\)z\dn)}  =  l 

if  X  n 

The  spectral  representation  can  be  modified  if  .F(X)  is  absolutely  continuous(cf. 
Hannan  (1970)  pp.  122-123,  Rozanov  (1967)pp.  39-41).  In  that  case,  if  there 
is  a  measurable  matrix  v?(X)  satisfying 

dF{\) 


V?(X)<p*(X)  = 


d\ 


=  /(X), 


then  there  is  a  process  (ai(X)  :  —n  <  X  <  7r}  with  orthogonal  increments 
which  satisfies 


y(n)  =  j 


e,nX<p(X)zi(dX),  ^{^(dX^KdX)}^  Idd\,  (2.1.15) 


where  Id  denote  the  d  X  d  identity  matrix.  This  representation  will  be  used 
later. 
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2.2.  Linear  Transformations  of  Stationary  Processes 


Let  y  =  { y(n)  :  — oo  <  n  <  oo  }  be  some  stationary  process  whose 
spectral  representation  is 


y(n)  =  J_  -•'*x 


e  z\ 


(d\). 


We  will  say  that  the  process  x  =  {  x(n) :  — oo  <  n  <  oo  }  is  obtained  from  y 
by  a  linear  transformation,  if  x  admits  a  spectral  representation  of  the  form 

x(n)  =  J  e,nX  h(\)z(d\), 

where  h(\)  satisfies 

J  h(\)F(d\)h*(\)  <  oo. 


We  will  call  the  function  h  the  frequency  response  function  of  the  linear 
transformation.  The  joint  covariance  function  of  x  and  y  is 


E{x{m)y*(n)}  — 


E{  ( J  e,  mX  /i(X)z(dX))(  J  e,nM  z{dii))*  } 
=  E{  J  J  ei  mX~intih(\)z(d\)z*(dfi) } 

=  [  e*(*n-n)\  h(X)F(d\),  (cf.  p.  10), 

J  —  If 


which  depends  only  on  m  —  n.  Thus  the  stationary  processes  x  and  y  are 
stationarily  correlated  as  defined  in  Section  2.1.  Let  us  call  Fx  the  spectral 
distribution  function  for  x,  Fv  for  y,  and  Fxy  their  joint  spectral  distribution 
function.  It  is  not  difficult  to  see 


F.W  =  l2  2  D 

F,y(\)  =  MX)fY(X). 

The  conditions  (2.2.1)  are  not  only  necessary,  but  also  sufficient  for  x  to 
be  obtained  from  y  by  a  linear  transformation.  This  fact  is  proved  in  the 
following  theorem. 


(2.2.2)  THEOREM.  Let  the  stationary  processes  x  and  y  be  stationarily 
correlated.  In  order  that  x  be  obtainable  from  y  by  a  linear  transformation 
with  frequency  response  function  h,  it  is  necessary  and  sufficient  that  the 
spectral  distribution  functions  Fx,  Fv  and  Fxy  satisfy  conditions  (2.2.1). 

Proof.  See  Rozanov  (1967)  p.  36.  B 

Remark.  If  the  stationary  process  y  has  a  spectral  density  /„  then 
a  process  x  which  is  obtainable  from  y  by  a  linear  transformation  with 
frequency  response  function  h  also  has  a  spectral  density,  which  is  given  by 

/*(X)  =  h(\)fy(\)h*(\).  '  (2.2.3) 

Let  us  consider  two  examples  of  linear  models  known  as  moving  average 
processes  and  autoregressive  processes. 

(2.2.4)  EXAMPLE.  Moving  Average  Processes.  A  moving  average 
process  y  —  {  y{n)  :  — oo  <  n  <  oo  }  is  defined  by  the  following  expression 

OO 

y(n)=  X)  (2-2-5) 

j'=— oo 

where  e  =  (e(n)  :  —  oo  <  n  <  oo}  is  a  sequence  of  uncorrelated  random 
vectors  with  covariance  matrix  G  and  {A(j)  :  — oo  <  j  <  oo}  axe  d  X 
d  real  matrices.  Note  that  the  process  y  is  obtained  from  £  by  a  linear 
transformation  with  h(\)  =  A  necessary  and  sufficient 

condition  for  the  series  in  (2.2.5)  to  converge  in  mean  square  is 

f)  IM)IP  <  «. 

j  =  -oo 
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where  ||-||  stands  for  any  norm  of  a  matrix.  It  is  straightforward  to  see  that 
the  process  e  has  spectral  density  function  (27r)-1C?.  From  Equation  (2.2.3), 
we  easily  obtain  the  spectral  density  function  /  for  y, 

m = E  g  f  e  , 

\j=>  — OO  '  '■J— oo  ' 

which  is  a  continuous  function.  Therefore,  the  spectral  density  function  of  a 
moving  average  process  is  absolutely  continuous. 

Alternatively,  suppose  we  are  given  the  stationary  process  y  with  ab¬ 
solutely  continuous  spectral  distribution  function  F  and  spectral  density  f. 
Since  /(X)  is  Hermitian  non-negative,  /(X)  can  be  diagonalized  by  an  unitary 
matrix  f7(X)*: 

/(\)  =  U(\)D(\)IT{\), 

where  Z>(X)  is  a  diagonal  matrix  and  the  elements  of  D  are  real  and  non¬ 
negative.  Define 

(/(X))‘/2  =  U(\)(D(\))i/2U'(\), 

where  (D(X))1/,Z  is  obtained  by  taking  the  non-negative  square  root  of  all  the 
elements  of  jD(X).  Then  (/(X))1^2  is  a  Hermitian  non-negative,  measurable 
function  and 

Z(X)=(/(X))I/i!((/(X))1/2)‘. 

Then  each  element  of  (/(X))1^2  is  square  intcgrablc  and  has  a  Fourier  series 
expansion,  namely, 

(/(x))1/!  =  E  E  iw)ii!  <  °°. 

jsm—OO  j=  —  00 


tSce,  for  example,  Strang  (1976)  pp.  212-213. 
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where 


au)  =  ^  2  *iA^- 


It  follows  from  (2.1.15)  that  the  spectral  representation  of  y  can  be  written 
as 

#(»)  =  f_*  ei"X(/(X))1/SZi{<iX) 

/IT  OO 

ei”x  ^ 

’*  J=— OO 

=  2  AUHn~j)> 

where 


JSB—  OO 


:(n) = L 


einX*i(<fX),  £{  ^(dXjz^dX) }  =  IddX. 


We  see  that  {  €(n)  :  — oo  <  n  <  oo }  is  a  sequence  of  uncorrelated  random 
vectors.  This  shows  that  y  is  a  moving  average  process.  Thus  we  have  proved 
the  following  theorem. 


(2.2.6)  THEOREM.  A  weakly  stationary  process  is  a  moving  average 
process  if  and  only  if  its  spectral  distribution  function  is  absolutely  con¬ 
tinuous. 

Proof.  See  Ronazov  (1967)  pp.  39-42.  B 

The  next  two  theorems  deal  with  stationary  processes  with  absolutely 
continuous  distribution  having  spectral  densities  which  are  polynomials  in 
e-«X.  We  first  state  the  theorem  for  scalar  case. 

(2.2.7)  THEOREM.  If  the  spectral  density  function 

/(X)  =  5Z  ^  °»  iti)  =  7(“i)»  7(9)  7^  0, 

i=-<7 
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where  7 (j)  is  real,  then  it  may  be  represented  in  the  form 


/(>0  =  2^153  a0')e  QC?) is  real»  (2-2-8) 

where  the  polynomial  a0')z*  ^ias  no  zeros  i&  the  open  unit  disc. 

Proof.  We  follow  the  proof  given  in  Hannan  (1970)  pp.  62-63  .  We  consider 

.  M{z)=  ]T  7(j>"y. 

/«■— * 

This  expression  has  2 q  zeros,  counting'each  with  its  appropriate  multiplicity. 
Let  z/c  be  a  zero  whose  absolute  value  is  not  1,  then  z J”1  is  also  a  zero  because 
7 (j)  =  7(— j).  Thus  the  zeros  different  from  1  in  absolute  value  can  be  paired 
(zfc,  z^1).  Let  there  be  r  <  q  such  pairs  of  zeros  taking  each  pair  as  often  as 
its  multiplicity.  Then  there  are  2a  =  2q  —  2 r  zeros  of  absolute  value  1,  say 
et0k ,  k  —  1, . . . ,  2a.  Then 

/(\) = 7(?)<r*'s>'  n  («*  -  -  j;1)  n  (eix  - 

y—  1  fc=i 

=  n  («*  -  *,x«-ix  -  n  ii  -  «"*»• 

i—  1  i—i 

It  follows  that  the  bracked  factor  is  real  and  non-negative.  Moreover,  the 
derivative  of  the  bracked  factor  vanishes  at  X  =  6k  because  it  is  non-negative 
and  zero  at  X  =  0*.  Thus  Ok  occurs  in  pairs.  The  2 q  zeros  can  be  numbered 
and  divide  into  two  sets  (27, . . . ,  zq )  and  [zq+i, . . . ,  Z2q)  such  that  if  \zk\  <  1 
then  zq+k  —  **  1  and  if  | Zk\  =  1  then  zq+k  =  Zk-  Then 

m  =  ir1  n  (e'x  -  **)i2  =  ^iE  “OV’'  i2. 

fc»i  y=o 
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where 


a(g)2  =  2*7(9)  IJ  (2fc  l)  II 

i— 1  fc— 1 

Since  the  coefficients  in  M(z)  axe  real,  the  zeros  arc  real  or  occur  in  conjugate 
pairs.  Then  a(j)  will  be  real.  Q 

Note  that  (2.2.8)  is  the  spectral  density  function  of  a  9th  order  moving 
average  process  (cf.  pp.  12-13).  There  is  a  similar  result  for  the  vector  case. 
Since  the  proof  is  rather  long,  we  omit  it  and  refer  the  interested  reader  to 
Hannan  (1970)  and  Rozanov  (1967).  _ 

(2.2.9)  THEOREM.  A  non-negative  matrix  function 

/(X)  =  J2  rUK‘*>  r(i)  ^  od,  r(j)  =  r\-j), 

jmm-q 

which  has  a  determinant  not  identically  zero,  can  be  represented  in  the  form 

/w = Mi  e  -  (2-2i°) 

V— 0  '  V=o  ' 

where  A(0)  is  Hermitian  non-negative  definite,  A(j)  are  real  and  all  zeros  of 
det(X)j=p  AU)z3)  on  or  outside  the  unit  disc. 

Proof.  See  Hannan  (1970)  pp.  63-66.  fl 

The  factorization  (2.2.10)  leads  to  a  finite  order  moving  average  repre¬ 
sentation  (cf.  pp.  12-13),  namely, 

I/(n)  =  AU)*{n  -  j)»  E{  €(m)€*(n) }  =  5”  Id, 

3=0 

where  =  1  if  m  =  n,  S =  0  otherwise. 

We  now  study  another  important  linear  model  which  is  called  the  autoregres¬ 
sive  process. 
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(2.2.11)  EXAMPLE.  Autoregressive  Processes.  A  stationary  process 
y  =  {  y(n)  oo<n<oo}is  called  a  pth  order  autoregressive  process  if 
it  satisfies 

.  YjB(j)y(n-j)  =  €{n),  B[0)  =  Id,  B{p)f£Od,  (2.2.12) 

i»0 

where  €  =  {e(n)  :  — oo  <  n  <  00}  is  a  process  of  uncorrelated  random 
vectors  with  covariance  matrix  G.  If  y(n)  has  constant  mean,  /x,  then  we 
certainly  have 

i-o 

so  that  (2.2.12)  holds  for  the  new  process  {  y(n)  —  /x  :  —00  <  n  <  00  }  with 
c(n)  —  E{  e(n) }  on  the  right.  Therefore,  we  may  assume  E{  c(n) }  =  0  to 
facilitate  our  discussion.  If  all  zeros  of 


lie  outside  the  unit  disc,  then  a  solution  of  (2.2.12)  exists  and  is  of  the  form 

OO 

y(n)  =  22  AU)4n  -  i)>  ^(°)  =  (2.2.13) 

where  the  ||A(y)||  converges  exponentially  to  zero  as  j  increases  (cf.  Hannan 
(1970)  p.  326).  In  order  to  obtain  the  spectral  density  function  /  of  the  process 
y,  we  observe  that  e  is  obtained  from  y  by  a  linear  transformation  with 
h(\)  =  X^__0 1?(j)e-,jX.  From  Equation  (2.2.3),  we  obtain  h(\)f(\)h*(\)  = 
(27 r)-1G.  Hence 

'•j=0  '  v= 0  ' 
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Multiply  (2.2.12)  by  the  transpose  of  (2.2.13)  with  n  replaced  by  n  —  s  and 
take  expectation  on  both  sides,  wc  obtain 

p  .  OO 

E{  ^2  B(j)y{n  -  j)y'[n  -  s)}  =  E{  e(n)e,(n  -  s  -  k)A\k)} 

3—0  k= 0 

which  leads  to 

p 

^2  B(j)R(s  -  j)  =  6%G,  a  =  0,1,2,...  (2.2.14) 

o 

These  are  often  called  the  Yule- Walker  equations. 
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2.3.  The  Law  of  Large  Numbers  and  Central  Limit  Theorem 


The  study  of  strictly  stationary  processes  is  usually  carried  out  in  the 
context  of  measure-preserving  transformations,  in  this  section  we  shall  follow 
the  development  of  Doob  (1953),  Chapter  10.  Let  ( E,£,P )  be  a  probability 
space;  that  is  to  say,  E  is  a  space  of  points  u,  £  is  a  <r-field  of  subsets  of  E, 
and  P  is  a  probability  measure  on  £. 


(2.3.1)  DEFINITION.  Let  T  be  a  transformation  of  £  onto  itself.  It  is 
called  a  measure-preserving  set  transformation  if  the  following  conditions  are 
satisfied  : 

(Cl)  T  is  single  valued,  modulo  set  of  probability  0:  if  A\  is  an  image  of  A 
under  T,  the  class  of  all  images  of  A  is  the  class  of  all  measurable  sets 
differing  from  Ai  by  sets  of  probability  0. 

(C2)  P{TA)  =  P(A). 

(C3)  Neglecting  u  sets  of  probability  0, 


r(Ai  U  A2)  =  TAt  U  TA2 


T[E  —  A)  =  E—  TA. 


If  every  measurable  set  is  the  image  of  some  measurable  set  under  T, 
this  transformation  must  be  1-1  (neglecting  sets  of  probability  0)  and  the 
inverse  T~l  is  deGned,  and  is  also  a  measure-preserving  set  transformation. 
If  T  is  a  measure-preserving  set  transformation,  there  is  one  and  only  one 
transformation  Ti  defined  for  every  random  variable,  taking  random  variables 
into  random  variables,  and  having  the  following  properties: 
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(PI)  T\  is  single- valued  modulo  the  random  variables  which  vanish  with 
probability  1:  if  xi  is  an  image  of  x  under  T\,  the  class  of  all  images 
of  x  is  the  class  of  all  random  variables  equal  to  Xi  with  probability 
1. 

(P2)  Ti  is  consistent  with  T:  Til  a.  is  1  almost  everywhere  on  TA  and  0 
otherwise. 

(P3)  Ti  is  linear:  if  a,  b  are  constants  and  if  x,  y  are  random  variables, 

Ti(ax  +  by)==aTix  +  bTiy  a.e. 

(P4)  Ti  preserves  convergence:  if 

lim  xn  =  x  a.e., 

n— ►  oo 

then  limn_  ooTixn  =  Tx  a.e. . 

We  shall  use  the  same  notation  T  for  T\. 

(2.3.2)  PROPOSITION.  If  T  is  a  measure-preserving  set  transformation, 
and  if  x  is  a  random  variable,  the  stochastic  process 

{  y{n )  :  n  >  0  },  y(n)  =  Tnx,  (2.3.3) 

is  strictly  stationary,  and,  if  T  has  an  inverse,  the  stochastic  process 

{  y{n) :  -oo  <  n  <  oo  },  j/(n)  =  Tnx, 

is  also  strictly  stationary. 

Proof.  See  Doob  (1953)  pp.  454-455.  Q 


(2.3.4)  PROPOSITION.  Let  {y(n)  :  n  >  0}  be  a  strictly  stationary 
process,  then  there  is  one  and  only  one  measure-preserving  set  transformation 
T  such  that.Tny(0)  =  y(n)  a.e.  for  all  n  >  1. 

Proof.  See  Doob  (1953)  pp.  455-456.  B 

The  above  discussion  and  results  apply  equally  well  to  multi-dimensional 
processes;  the  interested  reader  is  referred  to  Rozanov  (1967)  Chapter  4  for 
an  excellent  treatment. 

(2.3.5)  DEFINITION.  A  measure-preserving  set  transformation  is  called 
metrically  transitive  if  the  only  invariant  random  variables  are  constant  with 
probability  one. 

The  process  defined  by  (2.3.3)  is  metrically  transitive  if  T  is  metrically 
transitive.  We  now  have  the  ergodic  theorem. 

(2.3.6)  THEOREM.  If  {  j/(n) :  n  >  0}  is  strictly  stationary  and  metrically 
transitive  with  E{  |j/j(n)|  }  <  oo  for  j  =  1, . . . ,  d,  then 

1  n_1 

lim  -  V  y(k )  =  E{  y( 0) }  a.e. 
h-*°°nfc=o 

Also  if  E{  yj{n)2  }  <  oo  for  j  =  1, . . . ,  d,  then 


Proof.  See  Hannan  (1970)  p.  203.  B 

We  shall  study  a  weaker  condition  than  metrically  transitive  which  im-  . 
plies  it.  Let  { y(n )  :  n  >  0}  be  a  strictly  stationary  process  defined  on  a 
probability  space  (E,£,P).  For  integers  0  <  a  <  6,  let  Tba  be  the  o-field 
generated  by  x ... ,  Vj(b),  0  <  j  <  d  (with  the  obvious  extension  for 
b  —  oo). 
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(2.3.8)  DEFINITION.  We  say  that  the  process  {  y(n ) :  n  >  0  }  is  <ji-mixing 
if  there  exists  a  non-negative  function  <j>  of  positive  integers  such  that 
limn_  oo  4>{n)  =  0  and  for  each  k  >  0,  and  n  >  1, 

.  sup{  I P(E2  |  E2)  -  P(E2) I :  E2  e  7%, E2  €  T?+n  }  <  n).  (2.3.9) 

Condition  (2.3.9)  simply  says  that  events  concerning  the  ’’future”  of  the 
process  become  almost  independent  of  events  in  the  past. 

Remark  1.  Let  {X(n)  :  n  >  0_}  be  a  stationary  Markov  chain  with 
finite  state  space,  and  let  y(n)  =  /(X(n))  where  /  is  some  real  function 
on  the  state  space.  If  (X(n)  :  n  >  0}  is  irreducible  and  aperiodic,  then 
{  y(n) :  n  >  0  }  is  ^-mixing  (cf.  Billingsley  (1968),  pp.  167-168). 

Remark  2.  If  (X(n)  :  n  >  0}  is  a  Markov  process  with  infinite  state 
space,  then  (y(n)  :  n  >  0}  is  ^-mixing  if  {X(n)  :  n  >  0}  satisfies  Doeblin’s 
condition,  has  one  ergodic  class,  and  is  aperiodic.* 

(2.3.10)  THEOREM.  Suppose  that  the  process  (t/(n)  :  n  >  0}  is  <f>- 
mixing  with  <  oo,  has  covariance  function  R,  and  has  mean 

zero  (E{  y}  —  0.  Then 

v*  Q  £  !'(*=))  =*  N(°’ E)- 

Here  =»  denotes  the  convergence  in  distribution  and  N(0,  X)  is  the  d- 
dimensional  normal  random  vector  with  mean  O  and  covariance  matrix  X, 

OO 

Z  =  K)  =  £  (2.3.11) 

&  =  —  OO 


rScc  Doob  (1953),  p.  190. 


The  elements  of  U  are 


oo  oo 

*ij  =  E{  »mvM  }  +  E  £{  vMvAk) }  +  E  E{  »<(%,-( 0) },  (2.3.12) 

fc  =  l  k  =  l 

and  the  series  converges  absolutely. 

Proof.  See  Billingsley  (1968)  pp.  174-177.  Q 

Let  f  be  the  spectral  density  function  of  the  process  y  in  Theorem 
(2.3.10).  From  this  theorem  we  know  ]T)£L_00|.Riy(&)|  ^  00  f°r  *»  3  ~ 
1, . . . ,  d,  therefore  /  has  Fourier  series- expansion  (cf.  Section  2.1), 

/M  =  ^  E 

i=— oo 

Observe  that  JC  =  -^0)  =  2^/(0).  In  the  next  section  we  will  show 

that  any  continuous  spectral  density  function  can  be  approximated  arbitrarily 
closely  by  the  spectral  density  function  at  zero  of  a  finite  order  autoregressive 
process.  Therefore,  we  can  use  techniques  in  the  time  series  literature  to 
obtain  an  estimate  of  S. 


23 


2.4  Approximation  of  Spectral  Density  Function 


In  this  section  we  shall  demonstrate  that  any  arbitrary  continuous  spectral 
density  function  /  can  be  approximated  by  a  polynomial  in  e-,x  having  the 
following  form 

aw  =  E  my-iSX.  m = n-ii,  m  ^  od.  (2.4.1) 

3—9 

In  Section  2.2,  we  have  seen  that  g  is  the  spectral  density  function  of  a 
finite  order  moving  average  process.  We  shall  also  show  that  /  can  be 
approximated  arbitrarily  closely  by  the  spectral  density  function  of  a  finite 
order  autoregressive  process. 

The  approximation  (2.4.1)  is  based  on  the  Weierstrass  approximation 
theorem  which  states: 


(2.4.2)  THEOREM.  If  k  is  a  continuous  function  of  period  2tt,  then 
corresponding  to  every  positive  number  e  there  exists  a  trigonometric  sum 

n ' 

S(\)  =  a0  +  ^2  (fly  cos  X  /  4-  bj  sin  X  j), 

i— 1 


such  that  the  inequality 


|S(X)  -  fc(X)|  <  6 


(2.4.3) 


is  satisfied  for  all  value  of  X. 


Proof.  See  Achieser  (1956).  B 

The  results  contained  in  the  following  remarks  will  be  used  throughout 
this  section. 

Remark  1.  For  any  continuous  function  /  of  period  27T,  let  k(X)  =  /(X)-f 
8 1  and  let  5"(X)  satisfy  the  Weierstrass  approximation  theorem  (2.4.2)  with  e 
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replaced  by  62  >  0  in  (2.4.3).  Then 

Si  —  82  <  S(X)  —  /(X)  <  Si  +  £3. 

By  properly  selecting  Si  and  62,  we  can  make  the  upper  bound  and  lower 
bound  of  S(X)  —  /(X)  arbitrarily  small. 

Remark  S.  If  k(\)  =  fc(— X)  is  an  even  function,  let  Si(\)  =  + 

S(— X)),  which  is  also  an  even  function.  Then  we  have 

|S,(X)  -  fc(X)|  =  |i(S(X)  -  k(\))  +  i(S(— X)  -  lc(-X))| 

< 

and 

Si(X)  =  i(S(X)  +  S(-X)) 

-  n  .  n 

=  ao  +  olZ  ay(cos  xi  +  cos(-xy))  +  -  2_J  6i(sin  x  3  +  sin(— X  y)) 

1  i-1  ’ 

=  =o  +  E  k(ei/x + =  E  C,.e-‘A 

J=»l  j=*-n 

where  i  =  y/~T,  c0  —  ao,  cy  =  c_y  =  5 ay  are  real. 

Similarly,  if  fc(X)  =  —  k(— X)  is  an  odd  function,  we  may  form  an  odd 
function  S2(X)  =  |(5(X)  -  5(-X)).  Then 

I S2(\)  -  fc(X)|  <  e 


and 


«  «  .  n 

Sj(x)  =  5  E  <Jj(cos  X  j  —  cos(— X,'))  +  k  6y(sin  X/-sin(-X») 


■  =  \  E  -  «'iA)  =  •'  E 

j==l  j——n 

where  do  =  0,  <fy  =  — d_y  =  —  j6y  are  real. 

The  next  theorem,  which  can  be  found  in  most  time  series  literature 
(cf.  Anderson  (1970)  pp.  410-411  ),  is  an  approximation  theorem  for  scalar 
processes. 
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(2.4.4)  THEOREM.  If  a  spectral  density  function  /  is  continuous  then  for 
any  e  >  0  there  is  a  spectral  density  function  of  the  form 

s(X)=  t,  (2.4.5) 

J=-m 

where  7(j)  =  ~t(— j)  is  real,  p(X)  >  e/2  and  |g(X)  —  /(X)|  <  e,  — 7r  <  X  <  rr. 

Proof.  Let  Si  =  3e/4,  82  =  e/4  and  take  g(X)  =  S(X)  as  described  in  Remark 
1.  Then 

e/2  <  g{\)  -  /(X)  <  e, 

and  g(\)  >  e/2  since  /  is  a  spectral  density  function  which  is  non-negative 
for  all  X.  Since  the  spectral  density  function  /  is  an  even  function  (cf.  Section 
2.1),  we  can  take  g(X)  as  (2.4.5)  according  to  the  argument  of  Remark  2.  Q 

(2.4.6)  COROLLARY.  If  a  spectral  density  function  /  is  continuous  then 
for  any  e  >  0  there  is  a  finite  order  moving  average  process  with  positive 
spectral  density,  say  g,  such  that 

|/(X)  —  g(\)\  <  e,  -7r  <  X  <  7T. 

Proof.  By  Theorem  (2.4.4),  /  can  be  approximated  by  a  spectral  density 
function  (2.4.5).  Then  apply  Theorem  (2.2.7),  it  follows  that  g  is  the  spectral 
density  function  of  a  finite  order  moving  average  process.  I 

(2.4.7)  COROLLARY.  If  a  spectral  density  function  /  is  continuous,  then 
for  any  arbitrary  e  >  0,  there  is  an  autoregressive  process  with  spectral 
density  function,  say  h,  such  that  |/(X)  —  h(\)\  <  e,  —i r  <  X  <  tt. 

Proof.  Let  /«(X)  =  /(X)  +  §,  —  7r  <  X  <  tt.  Obviously  /e(X)  is  continuous 
and  positive;  therefore,  the  reciprocal  of  /t(X),  /~1(X),  exists  and  is  also 


continuous  and  positive.  An  application  of  Theorem  (2.1.9)  shows  that  /~2  (X) 
is  the  spectral  density  function  of  a  stationary  process.  By  Corollary  (2.4.6) 
there  is  a  positive  spectral  density  function  of  a  moving  average  process,  say 
g,  such  that  l/TH^)  ~  y(X)|  ^  ~'n  —  ^  w^ere 


. _  1  min  g(\) 

2 6  max  /e(X)‘ 


Then 


|/(X)  -  g-'Ml  <  \  +  |/.(X)  -  J-'WI 


<  5  +  l/.(x)H»-,WH/rIW-»(X)| 


<  £• 


This  proves  the  theorem.  | 


To  prove  similar  theorems  for  vector  processes,  we  shall  begin  with  2- 
dimensional  case  ( d  —  2),  then  extend  the  results  to  d  >  2.  Let  /  be  a 
continuous  2X2  spectral  density  matrix.  Since  /(X)  =  /*(X)  (cf.  Section 
2.1),  we  can  take 


(  /.(X)  P.(X) + .'p2(xn 

\Pi(X)-  ipa(X)  /a(X)  J 


where  /*  and  pk  are  real  continuous  functions.  In  Section  2.1,  we  have  shown 
that  /(X)  ss  /'(— X),  therefore  /*  and  pi.are  even  functions  and  P2  is  an  odd 
function.  Then  we  can  properly  select  £1,  62  and  the  approximation  functions 
9k>  9k>  for  fk  and  pk>  k  =  1,  2  respectively,  such  that  for  e  >  0 


fffc(X)  =  7 fcyc  ,/x,  7fci  =  Ik, -j  is  real, 

y— nfc 

e  <  Afc(X)  =  fffc(X)  -  A(x)  <  2e, 


and 


m  j 


9i(X)  =  ,jX»  dij  =  di,-3 


is  real, 


«/ 4  <  U[\)  =  gi{\)  -  px(X)  <  e/2, 


and 


Put 


where 


mj 


92(X)  =  i  d2jc~xix,  d2j  =  -d2,_y  is  real, 

«/4  <  V(X)  =  S2(X)-p3(X)  <  e/2. 

_  /  «P0  ?l(X)  +  t?s(X)X 

Uw— -«(X)  Js(X)  J 

=  E  =  /W +  •*(*) 


j— m 


^(x) 


=  f  Al(X 

'  W)  -  .1 


’) 


Ai(X)  *7(X)  +  tV(xp 
tV(X)  A2(X) 

m  =  max(rifc,mfc), 

rfcjc(j)  =  Tffcj,  =0,  if  |i|  >  njt,  fc  =  1,2, 

rI2(y)  =  diy  - d2i,  dkj  =  0,  if  1/1  >mk,  ,k  =  l,2, 
r2i(y)  =  diy  +  d2y, 

and  J|A(X)J]  <  3e,  if  we  define  the  norm  to  be  maximum  row  sum.  Clearly, 
r(m)  Od,  J T(y)  is  real  and  Jn/(y)  =  r(—j).  To  show  the  non-negativity  of 
g(X),  we  observe  that  for  any  complex-valued  vector  ot 


ot'  =  a'  +  ib '  =  («i,  a2)  +  t(&i,  b2),  a,-,  b,-  6  5fc,  *  ==  1, 2, 

a*g(X)a£  *  ac*/(X)a  +  ac*A(X)a:, 

>  o:*A(X)a  (by  the  non-negativity  of /(X)) 

=  a|Aj,(X)  +  a\  A2(X)  +  2  a\d2U{\')  +  b\  Ai(X) 

+  b2  A2(X)  +  2bib2£/(X)  +  2(a26i  —  ®ib2)V(X). 

Since  A \t,XJ  and  V  are  positive,  the  worst  case  occurs  when  aio2  <  0,  bib2  < 

0  and  (<z2&i  —  oib2)  <  0.  In  this  case,  we  use  the  following  inequalities: 

Afc(X)  >  e, 

U{\),V{\)  <  e/2. 


We  then  have 

oc  g(X)a:  >  n((ai  "h  a2)2  +  (&i  +  &2)2  +  (a2  +  &i)2  +  (ai  ~  fo)2)  ^  0. 

Thus,  we  have  shown  that  we  can  find  an  approximation  g  which  is  also  a 
spectral  density  function  when  d  =  2.  Now,  assume  d  >  2  and  /  is  a  <£  X  d 
spectral  density  matrix  having  continuous  components.  We  may  obtain  an 
approximation  function  (2.4.1),  say  g,  by  applying  the  following  algorithm. 

(2.4.8)  ALGORITHM.  (Approximation  of  Spectral  Density  Function) 
Al.  [Initialization.]  Let  g^k\\)  =  /(X)  for  k  =  0.  Set  k  =  0,  i  =  1,  and 

y  =  2. 

A2.  [Polynomial  approximation.]  Let 


If  all  components  of  w(\)  are  polynomials  in  e~xX  then  go  to  A4.  Otherwise, 
find  a  2  X  2  matrix  function,  denoted  by  h,  which  approximates  io(X) 
using  the  procedure  described  in  pp.  27-28. 

A3.  [Update.]  Form  g^k+1^  by  substituting  h  for  w  in  g ^  and  update  k  = 
k  + 1. 

A4.  [Done?]  Set  j  =  j  +  1,  if  j  <  d  then  go  to  A2.  Otherwise,  set  t  =  i  +  1. 
If  t  <  d  then  put  j  —  i  +  1  and  go  to  A2. 

A5.  [Obtain  g.]  Put  g(X)  =  g^k\\)-  B 

Note  that  all  g ^  are  spectral  density  functions,  since  they  are  Hermitian 
non-negative  (cf.  Section  2.1).  Note  also  that  |]</fc+1)(X)  —  <7^(X)||  <  3e  for 


all  k.  This  algorithm  will  terminate  in  a  finite  number  of  steps,  M  (  M  < 
d[d  —  l)/2  ).  Step  A5.  puts  g  =  which  is  a  spectral  density  function 
having  the  form  (2.4.1),  and 

for  some  finite  constant  C  [C  <  3 M).  Thus  we  have  established  the  following 
result. 

(2.4.9)  THEOREM.  If  /  is  a  d  X  d  spectral  density  matrix  with  continuous 
components,  for  any  e  >  0  there  is  a  spectral  density  matrix  of  the  form 

y(x)  =  £  r( /)«“•'*.  r(m)  ft  od,  ru )  = 

jwm-m 

and  ||g(X)  —  /(X)||  <  e,  —  it  <  X  < 

(2. 4. 10)  COROLLARY.  Let  /  be  as  described  in  Theorem  (2.4.9),  then 
for  any  e,  e!  >  0,  there  is  a  finite  order  moving  average  process  with  spectral 
density  function  g,  such  that  ||/(X)  —  g(X)||  <  e,— 7r  <  X  <  7r.  Also,  there 
is  a  finite  order  autoregressive  process  with  spectral  density  function  h,  such 
that  ||/(X)  —  Ai(X)||  <  e',  — 7r  <  X  <  7r. 

Proof.  The  first  part  follows  directly  from  Theorem  (2.4.9)  and  Theorem 
(2.2.7).  For  the  second  part  let 

/«(X)  =  /(X)  +  €  •  Id, 

then  every  entry  of  /«(X)  is  continuous  and  /e(X)  is  positive  definite  for  all 
X.  Thus  the  reciprocal  of  /e(X),  fc(X)  =  /71(X),  exists  for  all  X  and  every 
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component  of  fc(X)  is  continuous.  Let 


S(X)  =  (£  -  ^(°)  =  Id,  A})  «  real, 

S'— 0  '  S— o  ' 


such  that  ||S(X)  —  fc(X)||  <  £i,  then 


11/ W-  S-‘(X)||  <  ||/(X)  -  /«(X)||  +  ||/.(X)  -  S-‘(X)|| 

<  £  +  ll/.(X)H  •  l|S_1(X)ll  •  lls(X)  -  fc(X)l|. 

And  the  reciprocal  of  5(X),  5,_1(X),  exists  except  perhaps  for  finitely  many  X 
and  is  the  spectral  density  function  of  an  autoregressive  process.  By  properly 
choosing  £i,  and  letting  h(\)  =  S-1(X),  we  obtain  ||/(X)  —  h(X)||  <  eL  Q 


CHAPTER  m 


THE  AUTOREGRESSIVE  METHOD 
AND  ITS  APPLICATIONS 


Based  on  the  results  of  Chapter  2,  it  is  clear  that  a  continuous  spectral 
density  function  can  be  approximated  arbitrarily  closely  by  the  spectral  den¬ 
sity  function  of  a  finite  order  autoregressive  process.  We  shall  see  that  it  is 
possible  to  use  this  result  to  obtain  consistent  point  estimates  and  asymptoti¬ 
cally  valid  confidence  intervals  for  quantities  of  interest. 

Let  {  y(n)  :  n  >  0  }  be  a  d-dimensional  strictly  stationary  process  which 
is  0-mixing  with  ]C^L=i  <  oo.  We  wish  to  estimate  the  quantity 

r  =  E{  y(n) }. 


From  Theorem  (2.3.6)  and  Theorem  (2.3.10)  we  know 


a.s. 


(3.0.1) 


y/n  (fn  -r)  =>  N(0,1 7), 

where  £  =  X^fc'L-oo  R[k)  and  the  series  converges  absolutely.  Corollary 
(2.4.10)  shows  that  for  any  arbitrarily  small  e  >  0  there  is  a  27*  such  that 
||27  —  .Z7e||  <  €,  and  this  27*  is  27T  times  the  spectral  density  function  at  zero 
of  a  finite  order  autoregressive  process.  Instead  of  estimating  27  directly,  the 
autoregressive  method  is  designed  to  estimate  the  approximation  27*. 


3.1.  The  Autoregressive  Method 


We  consider  the  A:th  order  vector  autoregressive  process  {  x(n) :  n  >  0  } 

which  satisfies 
fc 

EBM x(n-j)  =  £(n),  B{0)  =  Id,  B{k)^  0d,  (3.1.1) 

i-o 

where  the  e(n)  are  i.i.d.  (identically  and  independently  distributed)  with  mean 
zero  and  covariance  matrix  G  and  the  B(j)  are  square  matrices.  The  mean 
zero  assumption  of  e(n)  implies  that  x(n)  has  mean  zero.  This  is  not  always 
the  case.  If  the  mean  /x  of  x(n)  is  not  zero,  we  make  a  mean  correction  (i.e., 
we  replace  x(n)  by  x(n)  —  /x).  If  all  zeros  of  the 


lie  outside  the  unit  disc,  then  a  solution  of  (3.1.1)  exists  (cf.  Section  2.2)  and 
is  of  the  form 

*(n)  =  A0)e(n  ~  A(°)  =  Id‘ 
j~o 

Recall  that  the  spectral  density  function  g  of  {  x(n) :  n  >  0  }  is 

9W=(Eb(j>"'a)  1g(ebw)'"'a)  ■ 

Vj=0  '  V==0  ' 

To  estimate  the  parameters  B(j)  and  G,  it  is  natural  to  use  the  Yule- Walker 
equations,  which  are 
k 

X)  B  U)R  (*  "  J)  =  6M0G,  s  =  0, 1, 2, . . .  (3.1.2) 

;= o 

Since  there  are  k  +  1  unknown  matrices,  namely  5(1), . . .  ,B(k),  and  G,  we 
use  the  first  k  +  1  equations  of  (3.1.2).  By  taking  transpose  on  both  sides  and 
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using  R'(n )  =  J?(— n)  (cf.  Section  2.1),  we  obtain 

Y,B(j)R(j)=G, 

3=0 

k 

53  RU  -  =  -«(-«).  *  =  1. 2,  •  •  M  k. 

i™ 1 

Rewrite  the  equations  in  matrix  form,  we  obtain 
/  JR(0)  R(l)  ...  R(k  —  1)\  / B'(l)\ 

fi(-l)  -R(O)  ...R(k-2)  U'(2)  U(-2) 


(3.1.3) 


\R{l-k)R{2-k)...  R{ 0)  J\B'(k)J  \R{-k)J 

denoted  by  RkBk  =  —»V 

Given  a  sample  of  size  n,  say  {  x(t)  :  *  =  0, . . . ,  n  —  1 },  our  estimation 


equations  become 


a 

CkBk  =  Cfc 


where 


C[m)  =  (n-m)  1  53  x(m  +  o)x'U)t 

3  =  1 

replaces  R  (m)  to  obtain  Ck  from  £*,  ck  from  r*,  and Bk  =  (Bfc(l)>  •  •  •  >Bk{^)) 


is  an  estimate  of  Bk-  Since 


we  estimate  Q  by 


and  g(X)  by 


G  =  E  SOW- A 

3=0 


G*  =  ESt(j)G(-j), 

3=0 

\  — 1  / 


MM  =  (l>0>-7x)  6‘(Es*0>“a)  • 

The  consistency  of B  k{j)  andGfc  are  justified  by  the  following  theorem: 
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(3.1.4)  THEOREM.  If  {  x(n)  :  n  =  0,  1, . . . ,  n  —  1 }  is  generated  by  (3.1.1), 
where  e(n)  and  B(j)  are  as  stated  below  the  equation,  then  Bk,  Ck  and  Gk 
converge  almost  surely  to  B  k,  and  G  respectively  as  n  -*  oo. 

Proof.  See  Hannan  (1970)  pp.  329-332.  Q 

An  alternative  approach  for  the  estimation  of  parameters  is  to  assume 
that  the  c(n)  are  normally  distributed  and  to  use  the  method  of  maximum 
likelihood.  This  does  not  lead  to  the  estimates  just  discussed.  If  n  is  relatively 
large,  there  will  be  little  difference  between  the  maximum  likelihood  estimates 
and  those  derived  from  Equation  (3.1.3)  (cf.  Anderson  (1971)  pp.  183-186). 

We  assume  that  {  y(n)  :  n  >  0  }  is  generated  by  a  fcoth  order  autoregres¬ 
sive  process  with  parameters  Bk0  =  (2?k0(l), . . . ,  ■Bfc0(fco)),  and  Gfc0.  If  fco  is 

A  A 

known,  we  can  obtain  consistent  estimates  Bk0  and  Gk0  from  a  realization. 
Unfortunately,  the  value  ko  is  usually  not  known,  therefore  we  must  estimate 
the  true  order  according  to  certain  order  selection  rules.  We  shall  study  this 
topic  in  detail  later.  Now  let  us  assume  we  have  an  order  selection  criterion 
and  assume  that  there  is  a  finite  constant  K ,  known  a  priori  such  that  ko  < 

A  A 

K  <  oo.  From  Theorem  (3.1.4)  above,  it  follows  directly  thatHfc  and  Gk 
are  consistent  estimates  of  Bk0  and  Gfc0  respectively  for  all  k  >  fco.  So, 
in  order  to  estimate  B  which  appeared  in  (3.0.1)  we  propose  the  following 
autoregressive  method.- 

(3.1.4)  ALGORITHM  (The  Autoregressive  Method) 

Al.  [Which  criterion?]  Select  a  constant  K  which  serves  as  the  maximum 
order  of  the  autoregressive  model,  and  choose  a  criterion  for  order  deter¬ 
mination. 


A2.  [Parameters  estimation.]  Obtain  the  estimates Bk  and  Gk  by  fitting  the 

observations  {  y(0), . . . ,  y(n  —  1) }  to  a  kth  order  autoregressive  process 
for  k  =  0, 

A3.  [Select  order.]  Determine  the  order  according  to  the  selected  criterion. 
We  denote  this  selection  by  k. 

A4.  [Obtain  .57.]  Estimate  £  by  the  quantity 

Mg AiW)”H5AiW)#  ■  ■ 

Justification  for  the  autoregressive  method  appears  in  the  following  sec¬ 
tions. 


3.2  The  Univariate  Autoregressive  Method 

In  this  section  we  shall  study  the  univariate  case  in  detail.  The  variance 
constant  appearing  in  (3.0.1)  is  simply  a  scalar,  which  we  denote  by  a2.  The 
autoregressive  method  yields  an  estimate,  say  s2,  which  is  an  approximation 
of  a2.  We  will  examine  several  order  selection  criteria  for  the  autoregressive 
process  and  derive  asymptotic  properties  for  the  estimates. 

Recall  that  a  fcth  order  autoregressive  process  x  =  { x(n)  :  n  >  0 } 
satisfies 

k 

Y2  PU)x(n  ~  J)  =  c(n),  /3(0)  =  1,  (3(k)  0  (3.2.1) 

J—0 

where  f3(j)'s  are  constants  and  the  e(n)  are  uncorrelated  random  variables 
with  i?{e(n)}  =  0  and  variance  E{e(n)2}  =  <x2(e).  The  spectral  density 
function  fx  of  x  is  given  by 


/.(X)  = 


(3.2.2) 


Let  y  =  (t/(n)  :  n  >  0}  be  a  strictly  stationary  process  which  is 

(^-mixing  with  <t>{n)1/2  <  oo.  Let  the  spectral  representation  of  the 

process  y  be 

y(n)  =  [  e,nXx(dX),  E{  |z(dX)|2  }  =  F{d\). 

First,  we  will  show  that  we  can  find  a  finite  order  autoregressive  process 
x  which  is  close  to  the  process  y  in  the  sense  of  mean  square.  From  the 
discussion  in  Section  2.1,  Equation  (2.1.15),  we  know  that  the  representation 
can  be  modified  if  F  is  absolutely  continuous.  In  that  case,  if  h  is  a  Borel 
measurable  function  satisfying 


IMX)|2  =  ^  =  /(X),  (3.2.3) 
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then  there  is  a  z*  process  with  orthogonal  increments  which  satisfies 


e,nX/i(X)zi(<i\),  •  E{  |*i(<iX)|2  }  =  d\. 


Since  y  is  a  real  process  and  the  series  ^2^=-oo  converges  absolutely  (cf. 
Theorem  (2.3.10)),  /  is  real,  non-negative,  and  continuous  (cf.  p.  23).  Then 
we  may  take  h(\)  =  /(X)  which  is  the  positive  square  root  of  /(X).  From 

Corollary  (2.4.7),  we  know  that  for  given  6  >  0  there  is  a  spectral  density 
function  (3.1.3),  say  g(\)  =  l/|52*_0/3(.7)e“,jX|2,  suc^  ^at  1/00“ &WI  < 
—n  <  X  <  7r.  Let  2  =  {  z(n) :  n  >  0  }  be  defined  as  follows: 


c(n)  =  f  e‘nX\/g(X)zi(dX). 

J  —IT 


Since  g  is  real  and  non-negative  for  all  X,  the  positive  square  root  of  </(X), 
yj g(X),  exists  thus  z(n)  is  well-defined.  The  covariance  function  of  z  is 

E{  z(n  +  m)z*(n)  }  =  [  el<n+rn>x  e~{nX  y/^} d\ 

—IT 


/ 


=  f  e 

'  —If 


t’mX 


p(X)dX, 


which  depends  only  on  m.  This  shows  that  z  is  a  weakly  stationary  process. 
The  processes  z  and  y  are  close  in  the  sense  of  mean  square,  since 

E{  |x(»)  ~  s,(n)|2  }  =  E{  |  f  e("x(%/7(X)  -  v^X))z,(<iX)|2  } 

J  —It 

=  /  l\/7(xj-v/aWI2<ix 

J  —IT 

<  /  lv/7W-\/?WMv/7(x)+v/sWI^ 

•/  —If 

<  /  |/(X)  -  ff(X)|dX 

•/  — ir 

<  27r5. 
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Since  |l/£).— 0  ${j)e  ,,,X|  also  satisfies  (3.2.3),  the  spectral  representation  of 


x  can  also  be  written  as 


Define 


Then 


:(n)  =  [  e*nX  — i - - - a2(dX),  E{  j^2(dX)|2  }  =  dX. 

J-"  (£*-o  fid)***) 

H  PU)x(n  -  i)//5(0). 

.  J-0 


E« :» + »w«)>  -  £  ^o)e'mV<£x = £ 


27r//32(0),  if  m  =  0; 


if  m  =f^  0. 


This  shows  that  the  sequence  e(n)  are  uncorrelated.  Thus,  x  is  a  fcth  order 
autoregressive  process. 

We  now  turn  to  the  problem  of  parameter  estimation.  The  autoregressive 
method  requires  us  to.  estimate  0k{j)  and  o2(e)  for  different  values  of  k.  As 
indicated  in  Section  3.1,  we  solve  Equation  (3.1.3)  to  obtain  the  parameters. 
Since  we  deal  with  scalar  process,  R[—j)  =  R[j)-  Therefore,  Equation  (3.1.3) 


becomes 


'  R(0)  R(  1).  ...i?(fc-l)\ //?*(!)> 

R{  1)  i?(0)  . . .  R{k  —  2)  0k{2) 

•  •  •  • 

•  •  •  • 

•  •  •  • 

<R[k-l)R{k-2)...  R{ 0)  A/?fc(fc), 


rR(iy 
R(  2) 

J?(fc), 


(3.2.4) 


denoted  by  ZkPk  =  — r*.  The  solution  for  (3.2.4)  is  (3k  —  —  and 

o2(c)  =  Pk[j)R{j )•  The  equations  for  the  coefficients  of  the  (fc  +  l)th 


order  autoregressive  model  axe 

■  c  - 

where  qk  =  (R{k), . . . ,  5(1))'  and  (3^  =  (/3fc+1(l), . . . ,  0fc+1(*))'.  The 
partitioned  equations  of  (3.2.5)  are 

£fc/#+i  +  1k/3k+i{k  +  1)  =  -rfc>  .  (3.2.6) 

QkPkh  +  R{O)0k+1{k  +  1)  =  -R{k  +  1).  (3.2.7) 

Elimination  of  P^+x  from  (3.2.6)  and  (3.2.7)  yields 

g L  ru  ,  tiZ^rk-Rfi  +  l)  • 

pk+i{k  +  i)  —  -S.arl?4  . 

We  observe  that  JZkXqk  =  —(pk(k),pk(k  -  1), . .  AU)/*  therefore 

fc 

*(0)  -  -  E  AW*W  =  ^w. 

y-0 


Thus 


Substitution  of  pk+x[k  +  1)  into  (3.2.6)  gives 

A+i(j)  —  AW  +  A(*  +  1  —  j)A+i(*  +  1)»  i  —  1,2,...,  fc. 


*J+ito  = 


k+l 

E 

J-O 


k  sk+1  \ 

=  *(o)  +  E  AOW)  +  + 1)(  E  A(*  + 1  -  /W)J 

=  »!(«)  +  0k+i(k  +  1)(E  «(«=  +  !-  J?iw) 

=.  <^(<0(1  -  J9i+1(*  +  1)). 


To  summarize  this  method,  we  have  the  following  recursive  formulae:  for  each 

k  >  0 

o  n  .  ^  _  Zj=0R(k  +  1  -  fiM) 

0k+1{k  +  1)-  a2(a) 

Pk+i{j)  =  +  Afe(k  +  1"  j)Pk+i{k  +  1),  y  —  1, . . fc,  (3.2.8) 

*A+i(o)-i, 

fffc+i(£)  =  ^fc(e)  *  C1  -  Pl+iik  + 1))» 
with  initial  conditions 


<r&e)  =  R(0),  0o(p)  —  1. 


(3.2.9) 


This  is  a  simple  and  efficient  method  to  compute  parameters  Pk(j)  an<^  °fc(€) 
(cf.  Durbin  (1960)  pp.  139-153).  Given  observations  { 2/(0), . . . ,  y(n  —  1)},  we 
replace  R[n)  by  C{n)  in  (3.2.8)  and  (3.2.9)  to  obtain  estimates  Pk  and.  o2k{e) 
for  Pk  and  <rk(e)  respectively. 

The  problem  of  determining  the  order  for  an  autoregressive  process  has 
gained  much  attention' during  the  past  forty  years.  We  could  trace  interest  in 
this  area  to  as  early  as  the  mid-1940’s.  Quenouille  (1947)  considered  a  method 
for  determining  the  goodness-of-fit  for  autoregressive  model.  Later,  Akaike 
(1969,  1970)  introduced  a  method  by  using  the  concept  of  final  prediction 
error  (FPE).  A  few  years  later,  Akaike  (1974)  proposed  a  new  criterion  called 
Aji  Information  Criterion  ( AJC )  which  has  become  very  popular.  Other 
criteria  have  been  introduced  since  then  are,  for  example,  the  BIC  suggested 
by  Akaike  (1977),  and  Schwarz  (1978),  and  the  h  criteria  by  Hannan  and 
Quinn*  (1979).  In  this  paper,  we  shall  consider  the  AJC,  BIC,  and  h- 

criterion.  We  consider  a  stationary  process,  {  z(n)  :  n  >  0 },  generated  by 

^In  Hannan  and  Quinn’s  paper  they  used  ^  instead.  We  change  it  to  h  to  avoid  confusion 
with  the  ^-mixing  condition  for  stationary  processes. 
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(3.2.1).  The  estimate  of  the  true  order,  which  we  shall  call  ko,  is  obtained  by 

minimizing  one  of  the  following  quantities  : 

AIC(k )  =  nlogal(c)  +  2k, 

BIC[k )  =  nlogo-2(e)  +  fclogn, 

h(k)  =  nloga^(e)  +  2A:cloglogn,  c  >  1, 

where  d£(e)  are  the  estimates  of  a2(c)  obtained  from  the  kth  order  autoregres¬ 
sive  model  based  on  a  sample  of  size  n*.  We  denote  the  selected  order  by  kn ; 
we  will  use  k  whenever  there  is  no  confusion. 


We  now  state  without  proof  certain  theorems  on  the  asymptotic  properties 
of  these  estimates.  The  proofs  may  be  found  in  Shibata  (1976)  and  Hannan 
and  Quinn  (1979).  We  assume  that 
(Cl)  E*Lo /3(jW  °>  \x\  <  1;  E{  e(m)e(n)  }  =  5JJlo-2(e). 

(C2)  (c(n)  :  n  >  0}  consists  of  independent  random  variables  with  the 
same  normal  distribution  N{ 0,  cr2(e)). 

(C3)  The  true  order  ko  is  bounded  above  by  some  constant  K  <  oo  which 
is  known  a  priori. 


(3.2.10)  THEOREM.  Under  the  conditions  described  above  the  asymptotic 

A 

distribution  of  kn  selected  by  AIC  is  given  by 

t-fco  '  QK-k,  ko  <  k  <  K; 

otherwise, 


where 


lim  P(k 


n  =  k)  =  lPk  fc° 

lo. 


Pn  = 


ri  +2r2-f--”+nrn=s=»n  S’= 
r*  >  0,  integers 


=  e  (n  h 

ri+2rjH — +nrnasn  \'=1  J  J 


r i  >  0,  integers 


^Unless  it  is  important  to  have  the  subscripts  n,  we  shall  drop  them  for  convenience. 
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and  a,-  =  P(x2(t)  >  2t),  t  =  1  Here,  X2(*)  is  a  random  variable 

having  the  Chi-square  distribution  with  t  degree  of  freedom. 

Proof.  See  Shibata  (1976)  pp.  119-120.  Q 

The  next  theorem  holds  under  a  weaker  condition  :  we  may  replace  (C2) 

by 

(D2)  -E{c(n)  |  7n-i}  =  0,  E{e(n )2  j  7n-\)  =  <r2 ,  E{e(n )4}  <  oo,  where 
7n  is  the  o-field  generated  by  y(m),  m  <  n. 

(3.2.11)  THEOREM.  Under  the  conditions  (Cl),  (D2),  (C3)  and  E{  |c(n)|r  }  < 
oo,  for  some  r  >  4,  the  estimates  kn  obtained  via  BIC{k )  >r  h(k )  are  strongly 
consistent. 

Proof.  See  Hannan  and  Quinn  (1979)  pp.  192-193.  B 

Besides  the  order  selection  criteria  discussed  above,  one  may  resort  to  a 
statistical  test(cf.  Anderson  (1971)  p.  215,  Fishman  (1978)  p.  251).  We  test 
the  null  hypothesis  Hq:  autoregressive  processes  of  order  k  <  K  against  the 
alternative  Hi :  autoregressive  process  of  order  K.  The  statistic  for  testing  is 


which  has  a  limiting  x2-distribution  as  n  — ►  oo  with  K—k  degrees  of  freedom 
when  the  null  hypothesis  is  true.  We  reject  the  null  hypothesis  if 

Tk-„  >  xi-JK  -  k) 

where  Xi-aC^  ~  fc)  is  the  (1  —  a)th  quantile  of  the  x2-distribution  with  K  —  k 

A 

degrees  of  freedom.  We  select  the  order  k  to  be  the  first  k,  k  =  0, 1, . . . ,  K, 
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such  that  Hq  is  accepted.  For  sufficiently  large  n,  this  test  has  approximate 
significance  level  a.  Other  types  of  statistical  test  will  not  be  considered  in 
this  paper;  the  interested  reader  is  referred  to  Anderson  (1971),  Qucnouille 
(1947),  Grenander  and  Rosenblatt  (1957). 

The  variance  constant  obtained  by  the  autoregressive  method  is 


1  1 

2*  {Zj-ohur 


To  prove  the  consistency  of  S|,  let  us  define  the  AT-dimensional  vectors  (3, 

A 

(3  k  as  follows: 

0  =  (/?(1),  /3(2), . . . ,  p(ko),  0, . . . ,  0), 

K  =  (Ac(l),  &  (2), , .  .  • ,  0k(k),  0, . . . ,  0),  k  =  l,...,K, 

where  Pk{j)'s  are  the  estimates  of  the  coefficients  of  the  kth  order  autoregres¬ 
sive  process.  It  should  be  noted  that/§fc  and  a2(e)  are  consistent  for  k  >  ko 
(cf.  Anderson  (1970)  pp.  188-200).  Let 


S[x,y) 


.  V 

(1  +  x'e )2  ’ 


,i)e%K,xe9iK)yevi. 


We  observe  that 


s2  = 


a2(c) 


2*l£jLo«i)ls 


and  s2  =  S(fik>  is  an  estimate  of  s2  obtained  from  the  fcth  order 

autoregressive  model.  Since  5  is  a  continuous  function,  by  using  a  continuous 
mapping  argument  (cf.  Billingsley  (1968)  pp.  30-31),  we  know  that  s2  is  a 
consistent  estimate  of  a2  for  k  >  ko.  If  k  is  obtained  by  minimizing  BIC{k) 
or  h(k),  the  consistency  of  s|  follows  immediately  from  the  strong  consistency 


44 


A  A 

of  the  estimate  k.  If  k  is  obtained  via  minimizing  AIC(k),  then  for  any  S  > 

0,  e  >  0 

P{  l»i  -  »JI  >  e }  =  f;  p{  |3?  -  .’I  >  s,  k  =  k } 

=  Ep{i3i-»!I>«|£  =  *}--P{fc  =  fc} 

Jfc—0 

1 

From  Theorem  (3.2.10)  and  the  consistency  of  S2  for  k  >  ko,  for  sufficiently 
large  sample  size  M  we  have 


=  »»<* 
fc<fco  z 

^{l3J-*2|  >«}  <  |,  k0<k<K. 

Thus 

•p{i>!-.si>«}<  |  y,  p{*  =  *}  + ; 

fc>  Jko 

<  € 

This  shows  that  is  a  consistent  estimate  of  s2. 
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3.3.  The  Multidimensional  Autoregressive  Method 


For  the  vector  case,  we  first  derive  recursive  formulae  for  obtaining 
parameters  of  the  fcth  order  autoregressive  model,  Bk  and  Gk  (cf.  Section 
3.1).  To  solve 

=  (3.3.1) 


recursively  requires  an  auxiliary  equation  (cf.  Lee  (1980)) 

f  12(0)  12(1)  ...R{k-1)\{  A'k(k)  \  f  R{k)  \ 


12 ( — 1)  12(0)  . . .  R(k  —  2) 


\R[l-k)R(2-k) ...  12(0)  J 


Ai(A:-l) 


v  A'k( i)  ; 


R(k  —  1) 


,  (3.3.2) 


\  12(1)  J 


denoted  by  RkAk  =  —  qk.  Note  that  Equation  (3.3.2)  is  equivalent  to 


Y2  Ak(j)R(s  -k  +  j)  =  Od,  8  =  0, 1, . . . ,  k  -  1, 
i—  o 


(3.3.3) 


with  Ak( 0)  —  Id-  If  we  define 

£k[n)  =  Ak(k)x(n)  H - +  Afc(l)x(n  —  k  +  1)  +  Ajt(0)x(n  —  k ) 


=  5Z  Afc(i)x(n  -  k  +  j), 

3-0 


then  Equation  (3.3.3)  implies 


E{  £fc(n}x'(n  -  s) }  =  Od,  a  =  0, 1, .. . ,  k  -  1.  (3.3.4) 


Let  Hk  denote  the  covariance  matrix  of  £k(n),  then 

fc 

Hk  =  E{  £fc(n)&(n) }  =  E{  ^  Ak{j)x(n  -  k  +  j')£'fc(n) } 

3-0 

=  E{  x(n  -  k)$’k{n) }  (by  (3.3.4)) 

=  T  R(-j)KU). 


Equations  (3.3.1)  and  (3.3.2)  can  be  solved  in  an  efficient  recursive 
(cf.  Kailath  (1974),  Wiggins  and  Robinson  (1965)).  The  equations 
coefficients  of  autoregressive  model  of  order  ( k  +  1)  are 


manner 


for  the 


qk  V  -^i+i  ' 

(R(  0)  Y^4+i(*  +  !)' 

V  rk  %kJ \  -Afc+l 


R{k  +  1  y 


(3.3.5) 


(3.3.6) 


where 


bCJ i  =  (Biti(l),  Bkfl( 2) . Bk+1(k))', 

■Afc+i  —  (-^fc+i(k)j  -Afc+ i(fc  —  1)»  •  •  •  t-^fc+i(^))  • 
The  partitioned  equations  for  (3.3.5)  and  (3.3.6)  are 

+  qkB!k+l(k  +  1)  =  -rk, 

q'k^kU  +  «(o)Bi+1(*  + 1)  =  -K(-fc  - 1). 

and 

^(0K+1(fc.+  l)  +  rUSi  =  -«(*  +  !), 

rJtAA:+l(fc  +  1)  +  ik^hll  =  “9*r- 


(3.3.7) 

(3.3.8) 


(3.3.9) 

(3.3.10) 


The  solutions  for  (3.3.1)  and  (3.3.2)  are  B k  =  —  %k  1r/e  and  Ak  =  —  %k  1gjt 
respectively.  Substituting  them  into  Equations  (3.3.7) — (3.3.10),  we  obtain 

B[1li  =  Bk  +  AkB'k+l(k  +  l), 

+  1)  =  -(g'kAk  +  R(0))-‘(R(-t  -  1)  +  rfkBk), 

■^1+x  —  +  BkAk+l(k  +  1), 

A't+i(<=  +  1)  =  -(B(0)  +  r'kBky\R(k  +  1)  +  riAO, 

k 

and  BMR(-])  =  B(0)  + 

y-o 

k 

Hk=Y  Ak(j)R  (y)  =  il( 0)  + 


I 


Note  that  Hk  is  the  covariance  matrix  of  £fc(n),  hence  is  symmetric  and 
positive  definite. 

We  recursively  define 

k 

Ak+i  =  ^2  Bk{j)R  {k  +  1-  j) 

3=0 

pk+1  =  G;1/2Ak+1(H:1/2)'t 

with  initial  conditions 

G0  =  H0  =  R{0),  .  Ak{ 0)  =  Bk{ 0)  =  Id. 

Then 

Gjfc+i  =  Gk  +  Ak+iB'lc+1(k  +  1) 

=  Gk(l—  pfc+ip'fc+i), 

=  (aJ  - 

Hk+ x  =  Hk  +  Ak+i(k  +  l)2lfc+1 
=  Hk{l—  Pfc+iPfc+i), 

Equations  (3.3.11) — (3.3.14)  give  a  simple  way  to  compute  the  parameters 

of  a  given  order  autoregressive  process. 

We  assume  that  {  y{n)  :  n  >  0  }  is  generated  by  a  fcoth  order  autoregres¬ 
sive  process  with  parameters  Bko  =  ( Bko(l ), ,  Bko(ko ))  and  Gko.  Given  a 
sample  of  size  n,  we  use  Equations  (3.3.11)-(3.3.14)  to  compute  the  parameters 
for  different  orders  and  wc  adopt  Akaike’s  criterion  (cf.  Akaike  (1974))  to 
select  the  order.  As  in  the  scalar  case,  we  assume  ko  <  K  <  oo,  where  K 
is  known  a  priori.  We  select  the  order  kn  which  minimizes 


(3.3.11) 


(3.3.12) 


(3.3.13) 


(3.3.14) 


AIC(k)  =  n  log|Gfc|  +  2 kd2, 


where  |*|  denotes  the  determinant  of  a  matrix  and  G k  is  the  estimate  of 
Gk0  obtained  from  the  kth  order  autoregressive  process.  We  would  like  to 

A 

establish  some  asymptotic  properties  for  kn.  Before  we  can  proceed  we  need 
to  prove  the  following  lemma. 

(3.3.15)  LEMMA.  If  the  matrix  I  —  AA  is  positive  definite,  then 

0  <  \I~AA!\  <1  if  0d. 

Proof.  Let  X  be  an  eigenvalue  of  I  —  AA!  and  x  be  the  corresponding 
eigenvector,  then 

(I  —  AA!)x  =  \x. 

Multiply  both  sides  by  x'  to  obtain 

(1  —  Xjx'a:  —  x'AA'x  >  0. 

Because  x  is  an  eigenvector,  x  is' not  the  zero  vector.  This  implies  X  <  1, 
and  since  every  eigenvalue  of  a  positive  definite  matrix  is  positive  we  have 
0  <  X  <  1.  It  is  well  known  that 

|i-AA'|  =  nX„ 

t 

where  Xt-  are  the  eigenvalues  of  I  —  AA! .  If  \I—  AA!\  —  1  then  we  must  have 
X,-  =  1  for  all  i,  and  this  means  I  —  AA1  =  I  or  A  =  £?<*.  Q 

(3.3.16)  THEOREM.  The  asymptotic  distribution  of  kn  has  the  following 
property: 

lim  P{kn  =  k}  =  0,  if  k  <  fc0. 
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Proof.  Using  Equation  (3.3.13)  we  obtain 

|G/c+i|  =  |Gfc|  •  \I- pfc+ip'fc+1| 

=  |6„|  •  i(i-  />,#> yi-  ••!(/- pk+ip'k+ x)i. 

and  from  the  consistency  of  the  estimates  for  parameters  of  the  autoregressive 


process  of  order  kko,  we  have  in  probability 

lim  pkcp'ko  =  pk,p'ko  =  G-^L\AkaBfka{ka}G-k^\. 

n  oo 

Obviously,  Gk0-i  =£  0&  and  since  the  true  order  is  fco,  we  know  that 

Bk0{ko)  7^  °d ■  Also  note  that  Bko(k0)  =  -H^^A^,  this  implies  Ako  7^ 
0d.  So,  limn— oo  Pfc0Pfc0  7^  °d  in  probability. 


For  0  <  k  <  ko,  wc  have 

fel  (  TT  I  r _ 1  >  1  ^ 

\Gko\  ')  ~  \I-PkoP'ko\ 

Since  every  I  —  p,p(  is  positive  definite,  by  Lemma  (3.3.15) 


|J  —  p.Pil  <  1*  Therefore,  for  any  e  >  0,  there  exists  a  5  >  0  and  an  integer 


M  >  0,  such  that  for  any  n  >  M 

exp(2(fco  —  k)d2 /n)  <1+5, 
P{\Gk\/\Gko\  <  1  +  5}  <e. 

A 

From  the  definition  of  k,  we  have 

P{k  =  k}  =P{AIC{k)  <  AIC(m),  0  <  m<  K} 

<  P{  AIC(k)  <  AIC{k0 ) } 

=  P{|Gt|/|Gko|  <  exp(2(fc0  ~k)d2/n)} 

<  e, 

this  proves  the  theorem.  Q 


The  estimate  of  the  covariance  matrix  S  appearing  in  (3.0.1)  is  given  by 

£=(eW)  We^w)  • 

\7=0  '  S=0  ' 

A 

The  consistency  of  S  can  be  shown  in  a  manner  analogous  to  that  used  in 
the  last  paragraph  of  Section  3.2  by  properly  modifying  the  function  S. 


3.4.  Applications  to  Markov  Processes 

We  now  apply  the  results  derived  in  previous  sections  to  Markov  processes. 
Let  ( E ,  £ )  be  a  measurable  space. 

(3.4.1)  DEFINITION.  A  function  P  :  [E,£)  —  [0,1]  is  said  to  be  a 
probability  transition  function  if  : 

(a)  for  each  x  €  E,  P(x,  •)  is  a  probability  measure  on  ( E ,  £ ), 

(b)  for  each  B  6  £}P{:,B)  is  a  measurable  function  with  respect  to  £. 

The  n-step  probability  transition  functions  are  defined  by  setting  Pl(x,  B ) 
P(x,  B )  and 

Pn+1[x,B)  —  [  P'iy,B)P{x,dy). 

J  E 

Let  E°°  =  E  X  E  X  and  £°°  =  £  X  £  X  ••••  For  any  w  = 
(u/0,  wi, . . . )  6  E°°,  let  X(t)(w)  =  wf.  Then  given  P  and  an  initial  probability 
distribution  /z,  there  is  a  probability  measure  on  ( E°° ,  £°° )  such  that  for 
all  n  >  0  and  Bq,  B\, . . . ,  5n  E 

Pp{X(0)  €  Bo,X(l)  . ,X(n)  6  5n}  = 

[  n{dxo)  [  P(xo,  dxi)-  -  ■  I  P(xn-i,dxn). 

J  Bo  J  Bi  "  &n 

It  can  be  shown  that 

PM{ X(n  +  1)  G  B  |  X(0), . . . ,  X(n) }  =  {  X(n  +  1)  6  B  |  X(n) }.  (3.4.2) 

Equation  (3.4.2)  is  called  the  Markov  property  and  (X(n) :  n  >  0}  is  said  to 
be  a  Markov  process  with  state  space  E,  initial  distribution  /z  and  stationary 
probability  transition  function  P. 

For  our  simulation  studies  we  assume  that  X(n)  =>  X  as  n  — oo 
where  X  has  .stationary  distribution  7r.  It  is  known  that  if  we  take  the  initial 
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distribution  to  be  7r  then  the  process  (X(n)  :  n  >  0}  is  a  strictly  stationary 
process.  If  we  make  the  further  assumption  that  (X(n)  :  n  >  0}  is  <f>- 
mixing  with.  4>{n)xf2  <  oo,  then  Theorem  (2.3.10)  holds.  Therefore,  we 
can  apply  the  autoregressive  method  to  this  stationary  process  and  obtain 
reasonable  estimates  about  the  steady  state  quantities.  But  usually  we  do 
not  know  the  stationary  distribution  7r  in  real  situation,  otherwise  we  could 
compute  the  results  analytically  and  would  have  no  need  to  simulate.  So  we 
choose  Jf(0)  according  to  an  initial  distribution  /z,  and  simulate  the  process. 
This  procedure  yields  a  non-station  ary  process.  Since  this  non-stationary 
process  converges  in  distribution  to  a  stationary  random  variable,  the  process 
is  asymptotically  stationary.  In  order  to  apply  the  autoregressive  method  and 
obtain  consistent  estimates,  we  need  to  justify  the  method  for  non-stationary 
processes. 

Let  {-X"(n)  :  n  >  0}  be  a  Markov  process  with  initial  distribution  7r 
which  satisfies  the  regularity  conditions  for  Theorem  (2.3.10).  Then 

V*  Q  £(*(■')  -  **))  =*■  N(°- 
'7l  «'=0  ' 

The  autoregressive  method  says  that  for  e  >  0,  there  exists  a  such  that 

||27  —  ^7e||  <  e  and  that  from  a  sample  of  size  n  we  can  obtain  a  consistent 
^  _ 

estimate  Sn  for  under  the  stationary  distribution  7r.  Now,  let  /x  be  any 
initial  distribution  which  is  absolutely  continuous  with  respect  to  tt,  then  the 
Radon-Nikodym  derivative  exists  (cf.  Halmos  1950).  We  also  assume  that 
there  is  a  constant  C,  such  that  <  C  <  oo.  For  any  6  >  0,  let 

An(6)  =  {u>eE:\pn(u)-Z(\\>5}, 
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then  limn^oo  P»{  An(^) }  =  0.  Thus 

lim  P„{^n(«)}=  Urn  f  Px{An(6)}M(Jx) 

n— *oo  n—*ooJE 

=  lim  [  Px{An(6)}~n(dx) 

n— ►  oo  J  E  ax 

<  lim  C-P„{  >!„(«)  }  =  0. 

n— ►  oo 

* 

So,  also  converges  in  probability  to  Ut  under  PM.  Therefore,  we  have 
found  some  conditions  which  allow  us  to  apply  the  autoregressive  method  to 
simulation  output  data. 

Remark  1.  If  {X(n) :  n  >  0}  is  a  finite  state  space  irreducible  Markov 
chain  then  3^  <  C  always  holds  for  any  [x .  For  countable  state  space,  3^  < 
C  holds  if  fi  has  finite  support. 

Remark  2.  If  x  has  an  atom  x,  then  fi  =  Sx  satisfies  3^  <  C. 

It  is  often  the  case  that  we  need  to  study  the  steady-state  behavior  of  a 
continuous  time  Markov  chain.  Since  the  process  is  simulated  in  continuous 
time,  we  need  a  technique  which  converts  the  continuous  time  process  to  a 
discrete  time  series  in  order  to  apply  the  autoregressive  method.  This  can 
be  done  by  either  sampling  the  continuous  time  Markov  chain  (then  we  need 
to  consider  the  problem  of  what  can  be  inferred  about  the  full  process  from 
the  sample)  or  by  using  the  discrete  time  method  (cf.  Hordijk,  Iglehart,  and 
Schassberger  (1976)).  We  shall  briefly  review  this  method  below. 

Let  {X(t)  :  t  >  0}  be  a  continuous  time  Markov  chain  with  countable 
state  space  E.  Assume  the  Markov  chain  is  irreducible  and  positive  recurrent. 
Then  X(t)  =>  X  as  t  — >  00.  Let  the  probability  transition  function  be 

ph M  =  P{x(‘)  =  3 1  x{o)  =  i}, 
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and  let 


Qij  —  jjJPi j(0  |t»o> 


Q  =  {qij  :  i,j  6  E},  and  q{  =  —qa  (assume  0  <  g,  <  oo).  The  quantities 
Qij>  h  3  €  are  called  the  infinitesimal  parameters  of  the  process.  Let  ir  = 
{ 7r,-  :  t  E  E  }  be  the  stationary  distribution  of  the  process.  Then  iJQ  —  0, 


or  equivalently 


n'q'i  =  °»  3  €  E. 

i(zE 


Define  the  matrix  R  —  {  :  i,  j  6  E}  by 


fO,  if  *  =  3\ 

3  if  *  7^  3- 


(3.4.4) 


Let  /  :  E  — *•  JR.  We  are  interested  in  estimating 

r  =  £{/(X)  }  =  £/(■>,. 

ieE 

We  use  the  embedded  jump  chain  (X(n)  :  n  >  0}  which  is  a  Markov  chain 
with  probability  transition  function  R.  Let  X(0)  be  choosen  according  to 
the  stationary  distribution  p  of  the  embedded  jump  chain.  We  define  a  new 
function  g  for  (X(n)  :  n  >  0}  as  follows  : 

»w  =  /m?,-1,  •'  e  e. 


Define 

u(n)  =  g(X(n)), 

v[n)  =  qx\n),  (3.4.5) 

w(n)  =  u(n)  —  r  •  v(n). 

Since  {  X(n)  :  n  >  0  }  is  strictly  stationary,  so  are  the  processes  u  =  {  u(n )  : 
n  >  0},  v  =  {u(n)  :  n  >  0},  and  w  =  {?z/(n)  :  n  >  0}.  Note  that 
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p'R  =  p' ,  or  we  may  write 


which  implies 


E  Virij  =  Pj>  &>r  all  i  € 

i€E 


E  lq{3 =  °* 

ieE 


Since  the  stationary  distribution  is  unique  we  must  have  ptgt-  1  =  ex,-  for 
some  constant  c.  Therefore, 


E{  n(0) }  _  /(-^(Q))g>r(0)  } 

IRo)}-  ^{?x(0)} 

__  E,€g  /(Qg»~APf 

Eygfi  Pi 

=  X)  /(*)*<  =  r* 


(3.4.6) 


»'€S 

Thus  t//(n)  has  mean  zero.  Applying  Theorem  (2.3.10)  to  process  to  yields,  as 
n  — ►  oo. 


^/nii)  v/rz((xl/T5)  —  r*) 


iV(0, 1). 


(3.4.7) 


<r  a/v 

By  using  a  continuous  mapping  argument  we  may  replace  v  by  E{  t/(0) }  and 
Equation  (3.4.7)  becomes 

y/n(u/v  —  r) 


a/E{  v{0) } 


N(0, 1). 


The  variance  constant  cr2  is 


*2=  E  *»(*)» 


(3.4.8) 


fcss-OO 

and  by  the  definition  of  w,  we  have 


Rw{k)  =  ./?«(*)  -  2rRuv{k)  +  r2Rv{k). 
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Therefore 


oo  oo  oo 

a2=  RM-2r  J2  Ruv{k)  +  r 2  ^  *v(*)-  (3.4.9) 

fc=  — oo  fc=  — oo  k  =  —  oo 

Here,  Rw[k),  Ru(k),  and  Rv(k)  axe  the  covariance  functions  of  the  processes 
to,  ti,  and  v  respectively,  and  RUv{k)  is  the  cross-covariance  function  of 
the  processes  ii  and  i>.  Thus,  we  have  two  alternatives  to  implement  the 
discrete  time  method.  The  first  alternative  is  observe  the  processes  u  and 
v,  form  {w(n)  :  n  >  0}  by  replacing  rn  for  r  then  apply  the  univariate 
autoregressive  method  and  use  Equation  (3.4.8).  The  other  alternative  is  to 
use  the  vector  process  y  =  {  y(n)  :  n  >  0  }  defined  by  y(n)  =  (u(n),  v(n))7 
directly  and  obtain  an  estimate  of  cr2  by  using  Equation  (3.4.9)  through  the 
spectral  density  function  of  y. 

To  apply  the  autoregressive  method  to  semi-Markov  processes  simulated 
in  continuous  time,  we  first  apply  the  discrete  time  method  (cf.  Hordijk, 
Iglehart,  and  Schassberger  (1976)).  Then  we  either  use  Equation  (3.4.7)  or 
(3.4.8)  to  calculate  <x2. 


CHAPTER  IV 


VARIANCE  REDUCTION  TECHNIQUES 


Although  simulation  is  an  important  tool  for  analyzing  stochastic  sys¬ 
tems,  in  many  practical  applications  considerable  computer  time  is  required 
for  simulation  runs.  Therefore,  it  is 'desirable  to  develop  methods  that  al¬ 
low  us  to  obtain,  based  on  the  same  realizations,  improved  statistical  ac¬ 
curacy.  Such  methods  are  called  variance  reduction  techniques.  We  shall 
develop  several  variance  reduction  techniques  and  incorporate  them  into  our 
autoregressive  method. 

Throughout  this  chapter,  we  let  (X(n)  :  n  >  0}  be  a  strictly  stationary 
process.  Since  X(n )  has  the  same  distribution  for  all  n,  it  holds  trivially  that 
X(n)  =>  X  as  n  — ►  oo.  The  quantity  of  interest  is 

r  =  E{  1(X)  },  (4.0.1) 

where  /  :  E  — *■  3?  is  a  real-valued  measurable  function.  We  make  the  further 
assumption  that  (X(n) :  to  >  0}  is  0-mixing  with  Hn^C71)^2  <  00 •  Then 

{  E  /(*(•')) 

*=o 

is  a  strongly  consistent  unbiased  estimate  for  r  and  the  following  central  limit 
theorem  holds  for  ?(n)  (cf.  Section  2.33)  : 

V^fw  =>  N{ 0, 1),  (4.0.2) 

G 
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where 


?2  =  Var{  /(X(  0))  }  +  2  f)  Cov{  /(X(t)),  /(X(0))  }. 

1  =  1 

Our  goal  is  to  find  another  strongly  consistent  estimate  for  r  and  a  central 
limit  theorem  analogous  to  (4.0.2)  with  a  smaller  variance  constant  a2.  The 
motivation  for  doing  so  is  to  be  able  to  form  a  shorter  confidence  interval  for 
r. 

4.1.  Control  Variables  Method 

In  this  section  we  will  discuss  the  use  of  control  variables  to  achieve 
variance  reduction  in  the  simulation  and  hence  obtain  a  shorter  confidence 
interval.  A  good  introduction  to  this  technique  is  given  in  the  book  by  Gaver 
and  Thompson  (1973)  pp.  582-591.  Detailed  accounts  of  various  kinds  of 
control  variables  applications  can  be  found  in  Iglehart  and  Lewis  (1979), 
Lavenberg,  Moeller,  and  Welch  (1977),  and  Gaver  and  Shedler  (1971).  A 
control  variable  is  a  random  variable  whose  expectation  is  known  and  which 
is  correlated  with  the  process  under  study. 

Let  a  sequence  of  processes  {  Cj[n)  :  n  >  0  },  j  =  1, 2, . . . ,  k,  which  will 
serve  as  the  control  variables,  having  the  following  properties: 

(Pi)  They  are  fairly  easy  to  obtain;  i.e.,  we  do  not  spend  too  much  time 
generating  them. 

(P2)  They  are  correlated  with  the  original  process  (JC(n)  :  n  >  0}. 

(P3)  The  mean  E{  Cj(n) }  =  /xy  is  known  or  can  be  calculated  analytically. 

Define  a  (k  +  1)  dimensional  column  vector 

y(n)  =  (y0(n),  yx(n), . . . ,  yk{n))' 

=  (f(X{nj),  Ci(n)t...,Ck{n))',  n  >  0, 
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and  assume  that  the  process  {  y(n)  :  n  >  0 }  is  strictly  stationary  and  (f> - 
mixing  with  J2n  ^(n)1//2  <  oo.  Then  we  have  the  following  results  (cf.  Section 
2.3): 

1  n~l 

Vn  =  -  V]  y( *')  -+  A*  a.s.  and 

Vn(y„  -  a*)  =►  iV(0,  27),  (4.1.1) 

where 

A»  =  (r,/xi,...,/ifc)', 

27  =  (<y«j)>  *iJ  —  0, 1, . . . ,  k, 

and  ffij  is  defined  by  Equation  (2.3.12).  Now  let  (3  be  a  (k  +  1)  dimensional 
column  vector  of  .real  numbers.  If  we  take  /3  =  (l,/?i ,...,/?&)'  and  form 
y/?(n)  =  /(p£T(n))  +  DjLi  “  A*;),  then 

W»)  =  ~  Z( m<))  +  E  MW)  -  N 

«*=cA  j=i 

is  an  estimate  of  r.  We  have  rp(n)  — *  r  a.s.,  and  a  simple  application  of  the 
continuous  mapping  theorem  (cf.  Billingsley  (1968))  yields: 

0,1),  (4.1.2) 

where  Op  =  /3'27/3.  Note  that  Equation  (4.1.2)  can  be  written  as 

(?/?(n)  “  r) 

- =>  N{ 0, 1).  (4.1.3) 

op 

Since  there  is  no  restriction  in  selecting  the  (3j’s,  j  =  1, . . . ,  k,  we  pick 
{3  =  (3*  —  (1,  (3\, . . . ,  /?£)'  where  (3 *  minimizes  Op.  This  will  produce  the 
smallest  possible  confidence  interval  for  r.  To  minimize  o\  we  need  to  solve 


(4.1.4) 
(<701,  •  •*,cr0fc)/, 

minimize  b'Ab  +  2a'b.  (4.1.5) 

It  is  easily  seen  that  the  optimal  solution  for  (4.1.5)  is  b *  =  — A-1a.  We 
denote  the  corresponding  /3  by  /3*,  then 

tTp.  =  cqo  —  2a'A~1a  +  (A~1a),A(A~la) 

=  a0Q  —  a!  A~l  a. 

Since  A  is  the  covariance  matrix  of  the  control  variables,  it  is  positive  definite 
and  thus  so  is  A-1.  Hence,  we  reduce  the  variance  by  a  positive  amount 
a’ A~ la.  Since  the  covariance  matrix  is  not  known,  it  becomes  necessary  to 

*  a 

estimate  S.  Ifi7n  is  a  strong  estimate  of  S,  then  an  — ►  a  a.s.  andA„  — ►  A 
a.s.  as  n  — *•  oo.  Hence,  A"1  — ►  A""1  a.s..  Let  bn  =  —A~la'n,  it  is  clear 
that  bn  —*  b  a.s.  as  n  — ►  oo.  Then  ry(n)  and  cry  are  strongly  consistent 
estimates  of  r  and  <7p-  respectively. 

It  follows  that  for  0  <  7  <  1/2,  the  100(1  —  27)%  confidence  interval 


the  following  non-linear  programming  problem: 

minimize  c^  =  (3'EP 

subject  to  /30  =  1. 

Let  b  ( 0i  1  •  ■  • ,  @k)  ,  -A.  {  *">  3  ==  1,  •  •  • ,  k  and  a 


then 


-<«(':•  DC) 


=  Coo  +  2  a' b  +  b'Ab. 
Therefore,  the  problem  (4.1.4)  becomes 


for  r  is 


where  z\ — r  =  4>-1(l— 7)  and  <!>(•)  is  the  distribution  fun  tion  of  the  standard 
normal  random  variable. 

This  variance  reduction  technique  is  also  applicable  to  continuous  time 
Markov  chains  and  semi-Markov  processes.  We  first  apply  the  discrete  time 
method  discussed  in  Section  3.4  and  then  the  control  variables  method. 
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4.2.  Multiple  Estimates  Method 

The  multiple  estimates  method  of  variance  reduction  was  introduced  by 
Hcidelberger  (1977)  for  regenerative  Markov  processes.  In  this  section  we 
shall  slightly  modify  the  method  to  adapt  it  to  our  situation.  Here  (X(n)  : 
n  >  0  }  is  a  strictly  stationary  <f>- mixing  process. 

The  multiple  estimates  for  r  are  formed  by  choosing  new  measurable 
functions  fj’.E  — *■  such  that  E{  fj{X) }  =  r  for  1  <  j  <  k.  Assuming 

|/J(J1T(0J)| }  <  co,  then 

fi(n)  =  /j'(X(’))  —  r  a.s. 

71 1=0 

Each  fj[n)  is  a  strongly  consistent  estimate  for  r.  Define  a  (fc  +  l)-dimensional 
vector  y  as 

«(n)  =  !/lM,  •  •  • ,  y*(n)) 

=  (f(X{n)),  /x(X(n)), . . . ,  A(X(n)))’,  n  >  0. 

It  is  not  difficult  to  see  that  the.  process  y  =  (y(n)  :  n  >  0}  is  strictly 

stationary.  If  Nba  is.  the  tr-field  generated  by  X(a),  . ..,  X(b)  and  if 

is  the  a-field  generated  by  yj[a), . . .  ,yj{b),  0  <  j  <  k,  then  M £  C  Mb. 

Since  {X(n)  :  n  >  0}  is  ^mixing  it  follows  that  y  is  ^-mixing.  Thus 

Equation  (4.1.1)  holds  with  /u  =  (r,  r, . . . ,  r)'.  Let  /3  =  (/3q,/3i,  . . . ,  /3k)'  so 

that  Y^=o  Pj  =  1  and  form 

k 

*p{n)  = 

y=o 

We  have  rp(n)  — *•  r  a.s.  and 

V™  (f/?(n)  ~  r) 

- - =>  N{ 0, 1),  (4.2.1) 


G2 


where  a\  —  =  YJl,j=o PiVijPj- 

To  maximize  the  variance  reduction  we  solve  the  non-linear  programming 
problem: 

minimize  Op  =  P  £($ 

subject  to  /3'e  =  1, 

where  e  =  (1, 1, . . . ,  1)'.  The  optimal  solution  is 

0'  =  Z-'/e'Z-'e, 

=  l/e'S-'-e. 

To  apply  the  multiple  estimates  method,  we  need  to  find  the  proper 
functions  /y’s.  We  shall  discuss  the  choices  for  Markov  processes.  Let 
{  X(n)  :  n  >  0  }  be  a  discrete  time  Markov  process  with  probability  transition 
function  P.  One  choice  suggested  by  Heidelberger  (1977)  is  setting 

/o  =  /, 

U  =  Pjf,  j  >  i, 


(4.2.2) 


which  is  defined  by 

//(*)  =  f  Pjix>  dy)f{y)- 

J  E 

Obviously,  fj  =  Pfj- 1,  j  >  1.  It  can  be  shown  that  7r/  =  tt(P /).  By 
induction  it  follows  that 


rj  =  tt/j  =  7r(^/y-i) 

=  *//-i  =  •  •  •  —  r. 

This  idea  is  also  applicable  to  continuous  time  Markov  chain  (X(£)  : 
t  >  0}  with  discrete  state  space.  We  first  apply  the  discrete  time  method 
discussed  in  Section  3.4.  Let  (X(n)  :  n  >  0}  be  the  stationary  embedded 
jump  chain.  For  j  =  1, . . . ,  k  we  define 

0/(0  =  /i(0<7.rl»  *  €  ^ 

w>(n)  =  0/Wn)) 

i»(n)  =  ?x(n) 

lu^n)  =  uj(n)  ~  rv(n) 
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and 


3 


K.' 


1 


N 


to(n)  =  (u/0(n), . . . ,  ty/c(n))/ 
u (n)  =  (u0(n), . .  • ,  tifc(n))'. 

{tw(n)  :  n  >  0}  is  strictly  stationary  and  ^-mixing.  For  0  <  j  <  fc,  we 
have  ry  =  i?{  «y(n)  }/i?{  v(n)  }  (cf.  Section  3.4).  Thus  tu(n)  has  mean  zero, 


?y(n)  =  ~  — *  r  a.s.  and, 


v 


L/V 

where  Zw  =  X3yL-oo  -Ru »(j)»  By  using  a  continuous  mapping  argument  we 
may  replace  u  by  £^{u(0)}, 

s/n  (f(n)  -  /*) 


(4.2.3) 


l/£{v(0)} 

where  f(n)  =  (fo(n), . . . ,  ffc(n).  By  applying  the  continuous  mapping  theorem 
again,  we  immediately  obtain  £  —  £W/E{  v(0)  }2. 

To  select  /y  we  recall  that  (cf.  Section  3.4) 

7t'R  =  7T7, 

where  7r  is  the  stationary  distribution  of  the  process  {X(t)  •  t  >  0}  and  i? 
is  defined  by  Equation  (3.4.4).  Therefore,  7 r7/  =  tt'R  /.  This  suggests  we  let 
/o  =  /  and  choose 

fj  =  Rjf  j  >  1. 

Again  7r,/y  =  r  for  all  j. 


64 


4.3.  Innovation  Control  Method 

The  third  variance  reduction  technique  we  will  discuss  is  the  innovation 
control.  This  method  utilizes  the  so-called  Wold  decomposition  of  the  sta¬ 
tionary  process.  For  the  purpose  of  variance  reduction  we  need  only  discuss 
the  scalar  case. 

Following  Anderson  (1971),  we  use  the  double-infinite  sequence  of  ran¬ 
dom  variables  {  x(n)  :  — oo  <  n  <  oo  }  to  generate  the  relevent  Hilbert  space. 
Let  .Mn  be  the  closed  subspace  spanned  by  x(m),  m  <  n.  Thus  Aln  contains 
all  finite  linear  combinations,  Yljes  aU)xU)>  where  S  is  a  finite  set  of  integers 
which  are  less  than  or  equal  to  n,  as  well  as  their  limits  in  mean  square.  If  x 
and  y  are  two  elements  of  this  Hilbert  space,  then  E{  xy  }  is  called  the  inner 
product.  Clearly,  >tm  C  <Mn,  m  <  n.  We  put  M_oo  =  D^L-oo  an^ 
-Moo  —  M.  The  best  linear  prediction  of  x(n)  by  x(n  —  1),  x(n  —  2), . . .  is 
the  projection  of  x(n)  on  denoted  by  x(n).  Put  e(n)  =  x(n)  —  x(n), 

then  e(n)  Atn_i,  i.e.,  e(n)  is  orthogonal  to  every  element  in  .Mn_i.  The 
random  sequence  e(n)  is  usually  called  the  innovation.  If  E{  e(n)2  }  =  o2  = 
0,  the  process  is  said  to  be  purely  deterministic.  If  E{e(n)2  }  =  a1  >  0,  the 
process  is  called  regular.  The  Wold  decomposition,  Wold  (1954),  clarifies  the 
structure  of  a  stationary  process. 

(4.3.1)  WOLD  DECOMPOSITION  THEOREM.  If  (x(n)  :  -oo  < 
n  <  oo  }  is  a  regular  stationary  stochastic  process  with  E{  x(n)}  =  0,  it  can 
be  written  as 

OO 

x(n )  =  ^  a{j)c{n  -  j)  +  v(n )  =  u[n )  -f  v[n)  (4.3.2) 

3=0 
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where  J2jLoaU)2  <  °°>  fl(°)  =  £{e(n)}  =  &{v(n)}  =  °>  £(n)  £  -M» 

and  v{n)  €  -M—oo-  The  sequences  (c(n)  :  — co  <  n  <  00}  and  {v(n)  :  —00  < 
n  <  00  }  are  unique. 

Proof.  See  Anderson  (1971)  pp.  420-421.  B 

The  next  theorem  states  the  spectral  functions  of  u(n )  and  v[n) 


(4.3.3)  THEOREM.  If  z(n)  is  regular  with  spectral  distribution  function 
F  =  Fae  +  Fa  +  Fd,  and  /  is  the  derivative  of  Fae,  then  /  is  the  spectral 
density  function  of  u(n)  and  Fs  +  Fd  is  the  spectral  distribution  function  of 
v(n).  Furthermore, 


j—0 


Proof.  See  Hannan  (1970)pp.  140-141.  Q 

Now  we  apply  these  theorems  to  obtain  some  variance  reduction.  Let 
{-X’(n)  :  n  >  0}  be  a  strictly  stationary  and  (^-mixing  process.  We  wish 
to  estimate  r  =  E{g{X)Y  for  a  given  function  g.  Let  x(n )  =  g(X(nfj  and 
assume  the  process  {  x(n )  :  n  >  0}  is  regular.  Applying  Theorem  (4.3.1) 
yields 

OO 

z(n)  =  ^2  a(y)e(n  -  j)  +  v(n). 

j=0 


Let 


Define 


x(n)  =  ^2  aU)c(n  ~  j)  +  v(n)> 

3— 1 

e(n)  =  x(n)  —  x(n),  and  a2  =  E{  e(n)2  }. 
y[n)  =  (*(n),  e(n))'. 


^We  reserve  /  for  the  spectra!  density  function. 


Then  { ■y(n)  :  n  >  0  }  is  strictty  stationary  and  (^-mixing,  therefore 

1  n_1 

—  y(i )  — *  ix  a.s.,  and 

71 1=0 

A  (-  X)  -*•)=*  JV{£%  S), 

|=0  ' 

where 

M  =  (r.O)' 

OO 

?  =  E  *(»)-{"«}• 

n  =»— oo 

Here  oyy,  f,  j  =  1,  2  is  defined  by  Equation  (2.3.13).  Let  -F(X)  be  the 
spectral  distribution  matrix  of  y(n)  and  /(X)  =  djPac(X)/dX.  Then  R(n)  = 
etn*F(d\).  Note  that 

oo  oo  rir 

ff.i=  53  *u(»)  =  E  /  «'”X-Pn(«») 

*/  —  7T 

n=— oo  n=— oo 

=  2tt/ii(0)  +  c  =  <r2|  53  a(j)|2  +  c, 

i=o 

where  c  is  the  infinite  sum  of  the  integral  from  the  discrete  and  singular  parts 
of  -FU(X).  And 

OO 

*12  =  53  Cov{  x(n),  e(0) } 

n=— oo 

oo  oo 

=  53  Cov{  23  a0')e(n  -  *) + v(n)»  e(°) } 

n=— oo  3—0 

=  <72  53  a(n)» 

n=0 

oo 

<722  =  53  C(nM°)  }  =  <72- 

n=-oo 

As  we  have  done  in  Section  4.1,  we  take  /3  =  (1,  6)'  and  form  (3'y(n).  Then 


»/>(")  =£eV  WO) +  *W) 
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t  =0 


is  an  estimate  of  r  and  rp(n)  — ►  r  a.s.  as  n  — ►  oo.  Also  the  central  limit 
theorem  holds: 

y/n{rp(n)  -  r)/ap  =>  AT(0, 1)  as  n  — >  oo 

where  =  <ru  +  26<ri2  +  62o"22*  Again,  we  may  choose  (3*  to  minimize  o-p 
which  is 

f  —  (1.  -ovifavtf  —  (l,  ~ 

n=0 

and  (r - — :  (rjj  ^*12 22  —  c* 

We  notice  that  if  the  process  { x(n)  :  n  >  0 }  has  absolutely  continuous 
spectral  distribution  then  crj".  =  c  =  0  that  means  r^-(n)  is  a  constant 
random  variable  r. 

Although  this  method  seems  extremely  good,  we  encounter  difficulties  in . 
application.  In  general,  we  are  not  able  to  obtain  the  innovation  e(n),  unless 
we  know  the  true  values  of  the  parameters  (e.g.,  a(j)’s,  . . . )  for  {  x(n)  :  n  > 


CHAPTER  V 


NUMERICAL  EXAMPLES 


In  this  chapter,  we  present  four  examples  to  demonstrate  how  well  our 
method  performs.  One  of  the  examples,  the  outflow  process  in  a  lake  model, 
is  actually  a  first  order  autoregressive  process.  All  other  examples  come  from 
the  area  of  queueing  theory.  They  are  the  waiting  time  process  in  an  M /M/1 
queue,  the  passage  time  and  response  time  processes  in  a  closed  network  of 
queues,  and  the  queue  length  process  in  a  cyclic  queue.  These  processes  are 
regenerative  processes  for  which  we  can  calculate  the  theoretical  values  of  the 
parameters  being  estimated.  Therefore  we  are  able  to  make  the  comparison 
between  theoretical  values  and  simulation  estimates.  For  all  examples,  the 
results  obtained  from  the  autoregressive  method  are  quite  satisfactory. 

To  determine  the  order  of  autoregressive  model,  we  use  AIC,  BIC,  h- 
criterion,  and  the  statistical  test  discussed  in  Section  3.2  for  the  univariate 
case,  and  AIC  for  multivariate  autoregressive  method.  Among  different 
criteria,  the  results  obtained  by  using  the  AIC  are  usually  the  closest  ones  to 
true  values.  We  may  conclude  that  AIC  is  the  best  one  for  many  applications. 

All  problems  were  run  on  a  DEC-20  computer  and  we  used  the  build-in 
uniform  random  number  generator  to  simulate  all  processes. 
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5.1.  EXAMPLE.  Waiting  Time  Process  in  an  M/M/1  QueiM? 

Suppose  customers  arrive  at  a  single  service  facility.  If  a  customer  finds 
the  server  idle,  then  he  receives  service  immediately,  otherwise,  he  waits  his 
turn  to  be  served.  Assume  the  zeroth  customer  arrives  at  time  Iq  =  0,  finds 
a  free  server,  and  experiences  a  service  time  v(0).  The  nth  customer  arrives 
at  time  tn  and  experiences  a  service  time  t/(n).  Let  the  interarrival  times 
u(n )  —  tn  —  tn- 1,  where  n  >  1.  Also  assume  the  two  sequences  {u(n)  : 
n  >  1 }  and  { t/(n)  :  n  >  0 }  each  consists  of  i.i.d.  random  variables  and 
are  themselves  independent.  Let  E{v(0)}  =  /i-1,  E{u(l)}  =  X-1,  and  the 
traffic  intensity  p  =  \/p.  Let  W(n)  be  the  waiting  time  of  the  nth  customer. 
Then  W(n)  can  be  defined  recursively  by 

W(0)  =  0, 

.  W(n)  =  (  W(n  -  1)  +  X(n)  ]+  (5.1.1) 

=  max{0,  W(n  —  1)  +  X(n)},  n  >  1, 

where  X(n)  =  v(n  —  1)  —  it(n).  It  is  known  that  if  p  <  1  then  there  exists 
a  random  variable  W  such  that  W(n)  ^  W  as  n  — ►  oo.  This  model  is 
commonly  called  the  GI/Gjl  queue.  If  the  arrivals  form  a  Poission  process 
with  rate  X  and  service  times  are  exponentially  distributed  with  rate  p.,  then 
the  queue  is  called  an  M/M/1  queue.  We  are  interested  in  estimating  E(W), 
which  is  finite  if  Z?{u(n)2  }  <  oo. 

We  observe  the  {  W{n)  :  n  >  0  }  process  for  the  univariate  autoregressive 
method.  For  the  variance  reduction  techniques,  we  apply  the  control  variables 
and  multiple  estimates  methods.  It  is  natural  to  use  the  service  time  and 
intcrarrival  time  as  control  variables;  i.c.,  we  use  (W(n),  u(n),  v(n  —  1))  for 
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the  control  variables  method.  We  use  the  column  vector 


»(»)  =  (fo(W(n)),  fl(W(n)),  h{W(n)))\  (5.1.2) 

for  the  multiple  estimates  method.  Here,  the  /  function  is  /(x)  =  x,  fj(x)  — 
Pjf,  3  >  0.  In  order  to  calculate  fj,  we  need  to  find  the  probability  transition 
function  for  the  (W(n)}  process.  For  the  M /M /I  queue  it  is  easy  to  show 
that 

)<  *}  = 

Thus  g(x)  =  ^P{  X(n)  <  x  }  exists  for  all  x  and  we  write  P{  X{n )  E  dx  }  = 
g{x)dx.  Now  to  evaluate  /i(x)  we  have 


( y+7ieXx»  for  x  <  °; 

U  -X+IIe_#xa:»  for  x  >  0. 


/*oo  /*oo 

/i(x)  =  /  P(x,  dy)/(y)  =  /  yP{  W(n  +  1)  6  dy  |  W(n) 
./o  •'0 

»oo  /*oo 

=  /  yP{x  +X(n  +  1)  €  dy}  =  /  yg{y-x)dy. 
Jo  Jo 


=  *} 


We  find 


A(x)  =  x  +  +  .  ..  **— 


X/z  X(X  +  /x) 
To  evaluate  /2(x)  we  compute  the  integral 


/2(x)=  [  P{x,dy)/i(y). 

jo 


After  this  computation  we  find 


/a(z)  =  /i(z)  + 


It  is  possible  to  calculate  exactly  the  covariance  matrix  X  for  y(n)  in 
(5.1.2),  so  that  we  can  compute  the  theoretical  values  of  variance  reductions 
for  the  multiple  estimates  method.  The  idea  is  to  use  the  stationary  process 


{W(n)  :  n  >  0},  i.e.,  VV^O)  is  distributed  according  to  its  stationary  dis¬ 
tribution.  Define  the  generating  function  of  the  joint  Laplace  transform  of 
the  stationary  waiting  time  process  by 


C[zu  22,  s)=  Y^,Ei  «-*lMr(°)-**w(n) }  •  an, 

n=0 

where  |s|  <  1.  C(zi,Z2,s)  has  been  calculated  by  Blomqvist  (1967),  he  also 
gave  the  exact  form  for  the  covariance  function  of  {  W(n)  :  n  >  0  }  which  is 


R{k)  =  Cov{  W{k),  V7(0) } 

P2  P2  ,^3  V(! +P)2/  *!(-*  —  2)!  '  v  J 

We  can  obtain  <r,y  from  (7(21,22,3),  e.g. 

OO 

<ru  =  <r2{  W(0) }  +  2  £  <7ov{  VK(n),  W(0) }, 

n=  1 

oo 

a12  =  <r2  {  W(0) }  +  2  ^  Cov{  W(0),  W{n) } 

n==  1 

+  £  CoH  vv>(0)’  e'W("> }  +  £  c<w{  w{n)’  e"XH,(0)  }l 


and 


£  m w«)}  =  Jim  3)  U.-0...-0, 

n=0  3  12 

fj  Co»{  VK(0)e-w<”)  }  =  -  Jim 


*1  =0,za=*X 


n=0 


After  we  have  calculated  Oij,  we  can  obtain  c^.  by  using  Equation  (4.2.2). 

The  exact  form  of  the  covariance  function  R  allows  us  to  demonstrate  the 
autoregressive  approximation.  Recall  that  the  autoregressive  method  enables 
us  to  approximate  a2  (cf.  Equation  (4.0.2))  by  27T  times  the  spectral  density 


function  at  0  of  some  finite  order  autoregressive  process  (cf.  Section  3.1).  If 
we  assume  that  {  W{n)  :  n  >  0  }  is  a  pth  order  order  autoregressive  process, 
then  {  W(n)  :  n  >  0  }  satisfies 

E  wm*  -  i)  =  £(").  »M  =  1.  /5p(p)  ^  0, 

3—0 

where  e(n)  are  i.i.d.  random  variables  with  variance  <72(£).  We  solve  the 
following  Yule-Walker  equations  (cf.  Section  2.2),  namely 

p 

M)R(S  ~  i)  r=  °»  «  =  !,•••,  P, 

to  obtain  /5p(i)’s  and  c|(e)  =  Y?j=o  /?p(j)i?(— j).  The  corresponding  ap¬ 
proximation  for  <r2  is  27r/fc(0)  =  ffp(e)/IS3y=o  ^p0)|2,  Table  1  contains  the 
values  of  R[k)  for  A:  <  10,  it  also  gives  27r/k(0)  and  simulation  results  for 
27r/fc(0)  for  k  <  10.  From  this  table  we  can  see  that  27r/fc(0)  converges  fairly 
fast  to  a2  —  27r/oo(0)  (the  stationary  waiting  time  process  is  an  infinite  order 
autoregressive  process).  Observe  that  27r/i(0)  is  23.99  which  is  83%  of  a2 
(2^700(0)  =  29).  If  we  want  an  estimate  with  accuracy  90%  of  the  true 
value,  we  may  pick  the  order  of  the  autoregressive  model  to  be  as  low  as  3. 

To  see  how  well  the  autoregressive  method  performs  in  an  actual  simula¬ 
tion,  we  have  taken  fi  =  1,  p  =  0.5.  To  obtain  a  100(1  —  2-7)%  confidence  in¬ 
terval  with  half  length  1005%  of  the  true  value  of  E{  W  }  requires  z\ _ ,<7/ y/n  = 

6  •  E{  W}  for  some  7.  If  we  take  7  =  0.05,  6  =  0.1  then  the  number  of  cus¬ 
tomers,  n,  that  need  to  be  simulated  is  roughly  7850  or  3900  regenerative 
cycles.  For  the  purpose  of  comparing  the  autoregressive  method  and  the 
regenerative  method,  we  simulated  the  waiting  time  process  in  cycles  for  a 
total  run  length  of  4000  cycles  (the  expected  total  number  of  customers  is 
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8000).  We  consider  500,  1000,  2000,  3000,  4000  cycles,  the  longer  runs  being 
continuations  of  the  shorter  runs. 

All  runs  were  replicated  30  times.  For  each  replication  we  form  point 
estimates  and  confidence  intervals  for  various  parameters  of  interest.  We 
then  average  over  the  30  independent  replications  and  form  90%  confidence 
intervals  for  each  parameter;  this  is  done  by  using  the  central  limit  theorem 
for  i.i.d.  random  variables.  The  upper  bound  provided  for  order  selection  was 
10  for  all  criteria.  Table  2  shows  the  point  estimate  for  E{  W }  either  with 
or  without  a  variance  reduction  technique.  Table  3  contains  the  estimate  for 
a2  from  the  regenerative  method  as  well  as  the  approximation  of  a2  from 
the  autoregressive  method  for  various  kinds  of  order  selection  criteria.  By 
inspecting  the  results,  we  conclude  that  among  the  order  selection  criteria, 
AIC  seems  to  be  the  most  satisfactory  one  for  {W(n)  :  n  >  0}.  Table 
5  gives  the  average  order  of  autoregressive  model  determined  by  various 
order  selection  criteria.  We  notice  that  every  criterion  yielded  a  low  order 
autoregressive  model,  this  indicates  the  choice  of  the  upper  bound  K  —  10  is 
quite  sufficient.  In  Tables  4  and  7  we  report  the  coverage  probability  defined 


number  of  90%  confidence  intervals  covering  E{  W } 


s  total  number  of  confidence  intervals  formed 

which  has  expected  value  0.9. 

Table  6  contains  the  results  of  a|.  for  different  variance  reduction  tech¬ 
niques.  Here  we  estimated  the  optimal  multipliers  /?*  by  using  equations 
derived  in  Chapter  4.  From  Table  6  we  see  that  by  using  3  different  /  func¬ 
tions  for  the  multiple  estimates,  we  get  a  substantial  amount  of  variance 
reduction.  In  order  to  judge  a  variance  reduction  technique,  we  must  make  a 


fair  comparison.  Suppose  we  simulate  n\  customers  using  no  variance  reduc¬ 
tion  teclinique  (method  1)  and  ri2  customers  using  multiple  estimates  with  3 
/  functions  (method  2).  In  order  to  obtain  the  same  statistical  accuracy  (i.e. 
same  half  lengths  of  confidence  intervals)  we  have 

<7l  (72 

%1  —a  , -  —  Zl—a  , - > 

vAu  V™2 

that  is  equivalent  to  saying 

_2 

_  rci 

Q  • 

O  2  n2 

From  Table  6  we  find  that  for  method  2  we  can  cut  the  run  length  to  1/30 
of  the  run  length  of  method  1  and  still  obtain  the  same  statistical  accuracy. 
Of  course,  we  recognize  that  by  using  method  2  we  do  a  certain  amount  of 
extra  work.  Since  the  computing  time  depends  on  the  run  length  as  well  as 
the  upper  bound  selected  for  the  autoregressive  method,  it  is  hard  to  find 
the  relation  of  times  between  these  two  methods.  However,  based  on  our 
results,  the  half  length  of  confidence  interval  constructed  by  method  2  is  30 
(=  29/. 948)  times  smaller  than  that  of  method  1  but  requires  6  times  (for 
example,  take  the  last  row  of  Table  8,  6  «  (110.96  +  372.16)/(43.61  +  39.74)) 
as  much  CPU  time. 
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TABLE  1 

Point  Estimates  and  90%  Confidence  Intervals  for  27t/(0) 

Using  Various  Orders  of  k  in  the  Autoregressive  Method  for 
the  Waiting  Time  Process  in  an  M/M/1  Queue  with  p  =  1.0,  p  =  0.5 


Order 

True  Values 

Simulation  Results  for  different  number  of  cycles 

k 

ESI 

EES 1 

.  500 

1000 

2000 

mm 

4000 

U 

0.778 

23.99 

23.92 

±3.66 

24.47 

±  3.03 

24.48 

±2.77 

24.20 

±  1.95 

2 

.0.617 

25.56 

m 

-25.37 

±3.99 

26.27 

±3.44 

26.20 

±3.16 

25.82 

±2.28 

3 

0.497 

26.50 

m 

26.05 

±4.15 

26.90 

±3.71 

26.94 

±3.46 

26.64 

±2.51 

D 

0.403 

27.10 

26.79 

±4.53 

27.70 

±4.11 

27.69 

±3.61 

27.32 

±2.65 

5 

0.330 

27.57 

31.35 

±8.56 

1 

28.06 

±4.32 

28.26 

±3.81 

27.83 

±2.81 

6 

0.272 

27.86 

31.05 

±8.66 

27.30 

±5.09 

28.16 

±4.55 

28.66 

±4.04 

28.13 

±2.95 

D 

0.225 

28.12 

31.87 

±8.99 

28.05 

±5.39 

28.39 

±4.56 

m 

28.48 

±3.08 

8 

0.187 

28.28 

33.13 

±9.77 

28.88 

±5.83 

28.85 

±4.72 

29.45 

±4.39 

28.78 

±3.19 

9 

0.157 

28.46 

1 

| 

28.92 

±5.84 

29.22 

±4.70 

29.59 

±4.41 

28.94 

±3.23 

10 

0.131 

28.51 

32.93 

±9.78 

29.07 

±6.02 

29.06 

±4.66 

29.86 

±4.60 

29.13 

±3.31 

oo 

0 

■ 

— 

— 

— 

— 

— 

•  Results  are  based  on  30  independent  replications,  p(k)  =  R(k)/R{ 0),  R( 0)  = 
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TABLE  2 


Simulation  Results  for  E{W}  =  1.0  in  an  M /M/1  Queue 
with  j.i  =  1.0,  p  =  0.5,  Point  Estimates  and  90%  Confidence  Intervals 


Number  of 

No  Variance 

Control 

Multiple  Estimates 

Cycles 

Reduction 

Variables 

2  functions 

3  functions 

500 

1.0039 

0.9868 

0.9823 

0.9821 

±0.0610 

±  0.0379 

±  0.0160 

±  0.0082 

1000 

0.9901 

0.9812 

0.9861 

0.9928 

±0.0428 

±  0.0227 

±  0.0130 

±  0.0067 

2000 

0.9942 

0.9953 

0.9964 

0.9962 

±0.0316 

±0.0171 

±  0.0101 

±  0.0050 

3000 

0.9978 

0.9938 

0.9953 

0.9951 

±0.0250 

±0.0160 

±  0.0080 

±  0.0040 

4000 

0.9949 

0.9980 

1.0003 

0.9982 

±0.0189 

±0.0118 

±  0.0069 

±  0.0032 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  /3*  for  variance  reduction 

techniques. 

•  The  AIC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  10. 


TABLE  3 


Simulation  Results  for  a2  =  29.0  in  an  M/M/ 1  Queue 
with  fx  =  1.0,  p  =0.5,  Point  Estimates  and  90%  Confidence  Intervals 


TABLE  4 


•  Simulation  Results  for  Coverage  Probability  (=  0.9) 
in  an  M/ M/l  Queue  with  p,  =  1.0,  p  =  0.5, 
Point  Estimates  and  90%  Confidence  Intervals 


Number  of 

Cycles 

Regenerative 

Method  ‘ 

Univariate  Autoregressive  Method 

AIC 

BIG 

h 

Stat.  Test 

500 

0.70 

±0.14 

0.67 

±  0.14 

0.67 

±0.14 

0.67 

±0.14 

1000 

1 

| 

1 

mm 

1 

1 

■apm 

2000 

0,80 

±0.12 

0.80 

±0.12 

0.80 

±0.12 

0.80 

±0.12 

0.80 

±0.12 

3000 

0.80 

±0.12 

•  0.80 

±0.12 

0.80 

±0.12 

4000 

El 

0.90 

±0.09 

0.83 

±0.11 

0.90 

±0.09 

0.90 

±0.09 

TABLE  5 


Order  of  Autoregressive  Model  Selected  by  Different  Criteria 
in  an  M/M/1  Queue  with  p.  =  1.0,  p  =0.5, 

Point  Estimates  and  90%  Confidence  Intervals 


Number  of 

Cycles 

Univariate  Autoregressive  Method 

AIC 

BIC 

h 

Stat.  Test 

500 

2.37 

±0.56 

MKBM 

1000 

3.30 

±0.83 

n 

M 

2000 

3000 

3.83 

±0.81 

■BBS 

1 

2.23 

±0.51 

2.00 

±0.53 

4000 

3.90 

±0.75 

1.47 

±0.22 

2.14 

±0.47 

1.73 

±0.41 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Use  the  scalar  process  (W(n)  :  n  >  0}. 

•  The  maximum  order  for  the  autoregressive  model  is  K  —  10. 
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TABLE  6 


Point  Estimates  and  90%  Confidence  Intervals  for  a2  by  Applying 
Variance  Reduction  Technique  in  an  M/M/1  Queue  with  p  =  1.0,  p  =0.5 


Multivariate 

Method 

True 

Value 

Simulation  results  for  different  number  of  cycles 

500 

1000 

2000 

3000 

4000 

Control 

Variables 

— 

12.57 

±2.12 

13.36 

±  11.89 

13.92 

±2.14 

Multiple 
Estimates(2  /* s) 

4.23 

3.67 

±1.65 

m 

3.92 

±0.98 

m 

Multiple 
Estimates(3  /’ s) 

0.948 

0.589 

±0.373 

1 

0.679 

±0.220 

1H 

0.918 

±0.367 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  (3*  for  variance  reduction 

•  The  AIC  criterion  was  used  for  order  selection  with  maximum  order 

for  the  autoregressive  model  of  K  —  10. 


TABLE  7 


Point  Estimates  and  90%  Confidence  Intervals 
for  Coverage  Probability  by  Applying  Variance  Reduction  Technique 
in- an  M/M/1  Queue  with  p  =  1.0,  p  =0.5 


Multivariate 

True 

Simulation  results  for  different  number  of  cycles 

Method 

Value 

1000 

2000 

3000 

4000 

Control 

Variables 

0.9 

0.80 

±  .12 

0.80 

±  .12 

0.93 

±.08 

0.90 

±.09 

Multiple 

0.9 

1 

0.83 

0.90 

Estimates(2  f's) 

HBfl 

1 

±.11 

±.09 

Multiple 

0.9 

0.67 

■g 

1 

Estimates(3  /’ s) 

1 

±.14 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 
for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 


•  Point  estimates  are  obtained  by  estimating  (3*  for  variance  reduction 

•  The  AIC  criterion  was  used  for  order  selection  with  Maximum  order 

for  the  autoregressive  model  of  K  =  10. 
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TABLE  8 


Comparison  of  Total  CPU  Time  (in  seconds)  for  Different 
Methods  for  Estimating  E{W}  in  an  M/M/1  Queue 
with  (x  =  1.0,  p  =  0.5 


Number  of 

Cycles 

Generate  sample  path 

Autoregressive  method 

No  v.r. 

Multiple  2  /’ s 

Univariate 

Multivariate 

500 

5.85 

13.69 

5.95 

54.26 

1000 

11.51 

26.92  . 

11.02 

98.13 

2000 

22.39 

55.20 

20.76 

187.81 

3000 

33.85 

84.69 

31.06 

288.63 

4000 

43.61 

110.96 

39.74 

372.16 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  v.r.  is  the  abbreviation  for  variance  reduction. 


•  fi  and  /2  were  calculated  for  the  variance  reduction  method. 

•  Multiple  estimate  /i,  /2. 


5.2.  EXAMPLE.  The  Outflow  Process  in  a  Lake 

Let  S(n),  Z(n),  and  X(n)  denote  the  volume  of  water  stored  in  a  lake, 
the  inflow  of  water,  and  the  outflow  of  water  respectively  at  time  n.  Then 
the  amount  of  water  stored  at  time  n  +  1  is  defined  by  the  relation 

S(n  +  1)  =  S(n)  +  Z[n  +  1)  -  X(n  +  1).  (5.2.1) 

If  one  assumes  that  output  increases  with  storage  through  a  linear  relation, 
namely 

X(n)  =  a  ■  S(n),  0  <  a  <  1, 

then  Equation  (5.2.1)  has  the  form 

X(n  +  1)  =  pX(n )  +  e(n  +  1)  (5.2.2) 

where  p  =  1/(1  +  a)  and  e(n)  =  aZ(n)/(  1  +  a).  We  also  assume  that  (e(n)  : 
n  >  0  }  is  a  sequence  of  i.i.d.  random  variables  with 

P{  e(n)  eB}  =  p  •  S0(B )  +  (1  -  p)  [  f(y)dy. 

J  B 

This  yields  a  Markov  process  for  the  outflow  process  (X(n)  :  n  >  0}  with 
state  space  E  =  [0,  oo)  and  probability  transition  function 

P(x,  B)  —  p  •  6px(B )  +  (1  -  p)  /  f(y  -  px)dy. 

J  B 

We  notice  that  (X(n)  :  n  >  0 }  is  actually  a  first  order  autoregressive 

processes.  It  can  be  shown  that  X(n)  =>  X  as  n  — *•  oo  with 


E{X)  =  -  p)  =  E{Z(l)}. 


A  simulation  was  carried  out  to  estimate  E{X}.  We  have  taken  p  =  0.75 
and  the  e(n)  to  have  distribution 

P{  e(n)  <  z  }  =  1  —  (1  —  p)e~x,  x  >  0, 

which  results  in  X  being  exponentially  distributed  with  parameter  1. 

We  observe  (JT(n)  :  n  >  0}  and  use  the  univariate  autoregressive 

method.  It  is  also  possible  to  observe  the  variables  e(n)  during  the  simulation, 
since  we  know  p.  Thus  we  were  able  to  apply  the  innovation  control  method 
for  variance  reduction.  We  use  the  vector  process  {  y(n )  :  n  >  0  },  where 
y(n)  =  (X(n),  e(n)),|  for  the  innovation  control  method.  We  consider  1000, 
2000,  5000,  10000.  observations,  the  longer  runs  being  continuations  of  the 
shorter  runs.  All  runs  were  replicated  30  times.  Notice  that  the  process 
(X(n)  :  n  >  0}  is  indeed  a  first  order  autoregressive  process  and  the  process 
{  y(n)  :  n  >  0  }  can  be  written  as 


which  results  in  y(n).  being  a  first  order  autoregressive  process.  Therefore, 
the  upper  bound  provided  for  order  selection  was  5  for  all  criterions.  Table 
9-12  summarize  the  simulation  results  for  the  lake  model.  We  estimated  the 
optimal  (3 *  by  Equation  (4.3.4)  and  we  observe  that  o^.  was  actually  reduced 
to  zero  as  expected.  The  fact  that  (X(n)  :  n  >  0}  and  (y(n)  :  n  >  0} 
are  finite  order  autoregressive  processes  offers  us  a  chance  to  validate  the 
asymptotic  property  of  k  (cf.  Section  3.3  and  3.4)  obtained  by  various  order 
selection  criteria.  Table  12  contains  the  average  order  determined  by  each 
criterion.  The  table  shows  what  is  to  be  expected,  namely  overestimation  of 
the  order  by  AIC. 


TABLE  9 

Simulation  Results  for  E{X  }  (the  outflow)  in  the  Lake  Model 
with  p  =  0.75,  Point  Estimates  and  90%  Confidence  Intervals 


Number  of 

Events 

True 

Value 

No  Variance 

Reduction 

Innovation 

Control 

1000 

1.0 

1.0179 

±0.0239 

0.9912 

±0.0010 

2000 

1.0 

1.0229 

'  ±0.0173 

0.9984 

±0.0006 

5000 

1.0 

1.0151 

±0.0092 

0.9994 

±0.0001 

10000 

1.0 

1.0069 

±0.0070 

0.9997 

±0.0001 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  /3*  for  the  variance  reduction 

techniques. 

•  The  AIC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  —  5. 
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TABLE  10 


Simulation  Results  of  cr2  for  (X(n)}  in  the  Lake  Model 
with  p  =  0.75,  Point  Estimates  and  90%  Confidence  Intervals 


Number  of 

Events 

Univariate  Autoregressive  Method 

Innovation  Control 

mm 

AIC 

BIC 

h 

Stat.  T. 

Simulation 

1000 

7.0 

7.2582 

±  .5358 

7.2172 

±  .4759 

7.1686 

±  .4940 

7.2172 

±  .4759 

0.0 

0.0238 

±  .0149 

7.0 

7.1428 

±  .3473 

7.1924 

±.3189' 

7.1402 

±  .3068 

7.1924 

±.3189 

0.0 

0.0149 

±.0117 

5000 

7.0 

7.0935 

±  .2349 

7.1171 

±  .2356 

7.0945 

±  .2305 

1 

0.0 

0.0030 

±  .0006 

10000 

7.0 

6.9887 

±.1616 

7.0245 

±  .1536 

7.0118 

±  .1463 

0.0 

0.0022 

±  .0011 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  (3*  for  variance  reduction 

techniques. 

•  The  AIC  criterion  was  used  to  select  the  order  for  the  multivariate  autoregressiv 

method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  5. 

•  True  V.  is  the  abbreviation  for  True  Value. 


•  Stat.  T.  is  the  abbreviation  for  Statistical  Test. 


TABLE  11 


Simulation  Results  for  Coverage  Probability  in  the  Lake  Model 
with-  p  =  0.75,  Point  Estimates  and  90%  Confidence  Intervals 


Number  of 

Events 

Univariate  Autoregressive  Method 

Innovation  Control 

AIC 

BIC 

h 

Stat.  T. 

Simulation 

1000 

0.9 

0.90 

±.09 

0.90 

±.09 

0.90 

±.09 

0.90 

±  .09 

0.9 

1.0 

±.o 

2000 

0.9 

0.90 

±  .09 

0.90 

±.09 

0.90 

-±  .09 

0.90 

±  .09 

0.9 

1.0 

±.0 

5000 

0.9 

0.97 

±.05 

0.97 

±.05 

0.97 

±  .05 

0.97 

±.05 

0.9 

1.0 

±  .0 

10000 

0.9 

0.93 

±  .08 

0.93 

±  .08 

0.93 

±.08 

0.93 

±.08 

0.9 

1.0 

±  .0 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  /3*  for  variance  reduction 

techniques. 

•  The  AIC  criterion  was  used  to  select  the  order  for  the  multivariate  autoregressive 

method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  5. 

•  True  V.  is  the  abbreviation  for  True  Value. 

•  Stat.  T.  is  the  abbreviation  for  Statistical  Test. 


TABLE  12 


Order  of  Autoregressive  Model  Selected  by  Different  Criteria  in  Lake  Model, 
with  p  =  0.75,  Point  Estimates  and  90%  Confidence  Intervals 


Number  of 

Univariate  Autoregressive  Method 

MAR 

Events 

AIC 

BIC 

h 

Stat.  Test 

Method 

1000 

1.3333 

1.0 

1.0667 

1.0 

1.0000 

±  .2410 

±.o 

±  .0762 

±  .0 

±  .0000 

1.6333 

1.0 

1.0667 

1.0 

1.1333 

2000 

±  .3391 

±  .0 

"±  .0762 

±  .0 

±  .1282 

5000 

1.3333 

1.0 

1.0667 

1.0 

1.1667 

±  .2410 

±.0 

±  .0762 

±.o 

±  .1778 

10000 

1.5000 

1.0 

1.0333 

1.0 

1.1000 

±  .2586 

±.o 

±  .0548 

±  .0 

±  .1908 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  (3*  for  variance  reduction 

techniques. 

•  AIC  criterion  was  used  to  select  the  order  for  the  multivariate  autoregressive 

method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  5. 

•  MAR  Method  is  abbreviation  for  multivariate  autoregressive  method. 


5.3.  EXAMPLE.  Closed  Network  of  Queues 


The  closed  queueing  network  has  been  used  in  computer  studies  to  model 
multiprogrammed  computer  system  (cf.  Gaver  and  Shedler  (1973)).  Iglehart 
and  Shedler  (1980)  have  written  a  monograph  dealing  with  mathematical 
and  statistical  methods  for  discrete  event  simulation  of  network  of  queues. 
Here  we  consider  the  simple  closed  network  of  queues  with  feedback  shown 
in  Figure  1. 

There  are  two  servers  and  a  fixed  number  of  jobs  N  circulating  in  the 
network.  When  a  job  completes  service  at  center  1,  in  accordance  with  a 
binary-valued  random  variable  tp,  the  job  joins  the  end  of  the  queue  at  center 
1  (when  Tp=l)  or  joins  the  end  of  the  queue  at  center  2  (when  xjj  =  0).  When 
the  job  finishes  service  at  center  2,  the  job  rejoins  the  end  of  the  queue 
at  center  1.  We  assume  that  the  service  discipline  is  first-come-first-serve. 
In  this  example  we  are  interested  in  estimating  the  limiting  passage  time 
(denoted  by  P )  and  the  limiting  response  time  (denoted  by  R).  Informally,  a 
passage  time  of  a  job  is  the  time  for  a  job  to  traverse  a  portion  of  a  network 
and  a  response  time  is  the  time  for  a  job  to  complete  one  cycle  of  a  network. 

For  the  simulation  of  this  queuing  network,  we  make  the  following  prob¬ 
abilistic  assumptions: 

(1)  the  two  sequences  of  service  times  at  both  centers  each  consists  of  i.i.d. 
random  variables,  exponentially  distributed  with  rate  X,-; 

(2)  is  a  Bernoulli  random  variable  with  P{ip  =  1}  =  p  and  values  of 
form  a  sequence  of  i.i.d.  random  variables; 

(3)  the  sequences  in  (1)  and  (2)  are  mutually  independent. 


Fig.  1.  Closed  network  of  queues. 


Let  {  P(n )  :  n  >  1  }  (respectively  {  R[n )  :  n  >  1 })  be  the  sequence  of 
passage  times  (respectively  response  times)  enumerated  in  order  of  passage 
time  starts  and  let  { S(n )  :  n  >  1 }  be  the  service  time  experienced  at 

center  1  associated  with  P(n).  We  notice  that  (S(n)}  is  a  sequence  of  i.i.d. 
random  variables.  Iglehart  and  Shedler  (19S0)  have  shown  that  P{n )  =»  P 
and  R(n)  =>  R  as  n  -  oo. 

The  simulation  was  carried  out  to  estimate  both  E{P}  and  E{R }  and 
we  have  taken  N  =  2,  Xi  =  1.0,  X2  =  0.5  and  p  —  0.75.  We  observe 
the  { P(n )  :  n  >  0},  {  R{n)  :  n  >  0},  and  (5(n)  :  n  >  0}  processes 
and  use  S(n)  as  the  control  variable  when  we  estimate  E{P)  and  E(R).  We 
consider  1000,  2000,  3000,  5000,  and  10000  observations,  the  longer  runs 
being  continuations  of  the  shorter  runs.  All  runs  were  replicated  30  times. 

Table  13-19  summarize  the  simulation  results  for  estimating  E{P}  and 


E{R}.  The  calculation  of  E{P},  E{R}  and  the  variance  constant  cor¬ 
responding  to  passage  times  can  be  foundln  Section  9  of  Iglehart  and  Shedler 
(1980).  To  evaluate  the  variance  reduction  for  the  control  variables  method, 
we  carried  out  a  similar  computation  discussed  in  Iglehart  and  Shedler  (1980). 
Although  we  included  the  simulation-  results  for  the  variance  constant  of 
response  times,  we  can  not  provide  the  theoretical  value. 
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TABLE  13 


Simulation  Results  for  E{P  }  (Passage  Time)  and  E{R}  (Response  Time) 
in  a  Closed  Network  of  Queues  with  N  =  2,  \\  =  1.0,  X2  =  0.5,  p  =  0.75 


Number  of 

Events 

P  (Passage  Time) 

R  (Response  Time) 

True  Value 

No  v.r. 

With  v.r. 

True  Value 

No  v.r. 

With  v.r. 

500 

6.667 

BBH 

6. 684 

±  .019 

9.333 

9.363 

±.128 

9.334 

±  .044 

1000 

6.667 

I 

6.667 

±7014 

9.333 

9.293 

±  .091 

9.342 

±  .035 

2000 

6.667 

6.638 

±  .053 

6.661 

±  .009 

9.333 

9.309 

±.060 

9.331 

±.023 

5000 

6.667 

6.660 

±  .039 

6.666 

±  .005 

9.333 

9.332 

±  .014 

10000 

6.667 

1 

6  668 

±.027 

6.664 

±.003 

9.333 

9.337 

±  .024 

9.333 

±  .009 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  /3*  for  the  variance  reduction 

techniques. 

•  The  AIC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  v.r.  is  the  abbreviation  for  variance  reduction. 
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TABLE  14 


Simulation  Results  for  a2  of  Passage  Time  (P) 
in  a  Closed  Network  of  Queues  with  N  =  2,  Xi  =  1.0,  X2  =  0.5,  p  =  0.75 


Number  of 

Events 

Univariate  Autoregressive  Method 

Control  Variables 

AIC 

BIC 

h 

Stat.  T. 

Simulation 

500 

58.667 

59.887 

±4.511 

57.081 

±  3.392 

58.164 

±  3.201 

48.922 

±  3.744 

1.581 

2.300 

±  .132 

1000 

58.667 

58.794 

±  2.768 

56.220 

±2.349 

53.305 

±  2.887 

1.581 

MW™ 

2000 

58.667 

58.128 

±2.168 

55.607 

±  1.510 

56.203 

±  1.533 

54.830 

±  1.721 

1.581 

5000 

58.667 

59.746 

±  1.112 

57.262 

±  0.798 

58.565 

±  1.025 

56.989 

±  0.782 

1.581 

1 

lIBSSi 

10000 

58.667 

58.737 

±  0.995 

58.218 

±  0.807 

56.474 

±  0.575 

1.581 

1 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  (3 *  for  the  variance  reduction 

techniques. 

•  The  AIC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  25. 

•  True  V.  is  the  abbreviation  for  True  Value. 

•  Stat.  T.  is  the  abbreviation  for  Statistical  Test. 


TABLE  15 


Simulation  Results  for  a2  of  Response  Time  (R) 
in  a  Closed  Network  of  Queues  with  N  =  2,  Xi  =  1.0,  X2  =  0.5,  p  =  0.75 


Number  of 

Univariate  Autoregressive  Method 

Control  Variables 

Events 

AIC 

BIC 

h 

Stat.  T. 

Method 

500 

60.274 

62.094 

61.891 

58.065 

9.025 

±  4.840 

±4.119 

±  3.834 

±  4.468 

±  0.582 

59.989 

60.527 

61.039 

60.216 

8.879 

±  3.471 

±  2.752 

±  2.745 

±  3.088 

±  0.450 

59.602 

62.133 

60.621 

61.923 

8.650 

±  1.755 

±  1.600 

±  1.643 

±  1.885 

±  0.323 

60.909 

62.493 

I 

63.041 

8.305 

±  1.067 

±0.928 

±  0.834 

±  0.277 

10000 

59.160 

60.182 

60.006 

60.985 

8.354 

±  0.863 

±  0.832 

±  0.806 

±  0.861 

±  0.187  • 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  (3*  for  the  variance  reduction 

techniques. 

•  The  AIC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  25. 

•  Stat.  T.  is  the  abbreviation  for  Statistical  Test. 


TABLE  16 

Simulation  Results  for  Coverage  Probability(  =0.9)  of  Passage  Times  (P) 
in  a  Closed  Network  of  Queues  with  N  =  2,  \j  =  1.0,  X2  =  0.5,  p  =  0.75 


Number  of 

Univariate  Autoregressive  Method 

Control  Variables 

Events 

A1C 

BIC 

h 

Stat.  Test 

Method 

500 

0.90 

0.90 

0.90 

0.83 

0.93 

±.09 

±.09 

±.09 

±.11 

0.93 

0.93 

0.93 

0.87 

0.87 

±.08 

±.08 

±  .08 

— 

HjQI 

isa 

0.87 

0.87 

0.90 

m 

hem 

±.10 

±0.09 

a 

0.80 

0.80 

0.80 

0.80 

0.90 

±.12 

±.12 

±.12 

±.12 

±0.09 

10000 

0.83 

0.83 

0.83 

0.83 

0.93 

±.11 

±.11 

±.11 

±.11 

±0.08 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  /3*  for  variance  reduction 

techniques. 

•  The  AJC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  25. 

•  Stat.  T.  is  the  abbreviation  for  Statistical  Test. 
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TABLE  17 


Simulation  Results  for  Coverage  Probability^  =0.9)  of  Response  Times  (R) 
in  a  Closed  Network  of  Queues  with  N  =  2,  Xi  =  1.0,  X2  =  0.5,  p  =  0.75 


Number  of 

Events 

Univariate  Autoregressive  Method 

Control  Variables 

Method 

AIC 

BIC 

h 

Stat.  T. 

500 

EK9 

0.87 

±.10 

0.87 

±.10 

0.80 

±0.12 

1000 

0.80 

±.12 

0.80 

±  .12 

0.80 

±  .12 

0.83 

±.11 

0.87 

±0.12 

2000 

1 

BUM 

0.87 

±.10 

0.87 

±  .10 

H9 

0.87 

±0.10 

0.93 

±.08 

0.93 

±  .08  . 

0.93 

±.08 

0.93 

±.08 

1',  « SI* [(*■'■  & ' 

10000 

11111 

0.87 

±  .10 

m 

0.86 

±  .10 

0.93 

±0.08 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  for  the  variance  reduction 

techniques. 

•  The  AIC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  25. 

•  Stat.  T.  is  the  abbreviation  for  Statistical  Test. 


TABLE  18 


Order  of  Autoregressive  Model  Selected  by  Different  Criteria 
for  the  Passage  Times  (P) 

in  a  Closed  Network  of  Queues  with  N  =  2,  Xj  =  1.0,  X2  =  0.5,  p  =  0.75 


Number  of 

Univariate  Autoregressive  Method 

Events 

AJC 

BIC 

h 

Stat.  Test 

500 

2.07 

■rcra 

1 

0.27 

±  .62 

±  .16 

1 

1.83 

0.93 

1 

0.70 

±  .40 

±.n 

±.14 

1 . H 

1.97 

1.03 

1.20 

0.93 

±.33  . 

±.06 

±.15 

±  .08 

5000 

1.10 

1.43 

1.07 

±.09 

±.22 

±.08 

10000 

■n 

1.27 

1.67 

1.13 

■m 

±.13 

±.16 

±.10 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  /3*  for  the  variance  reduction 

techniques. 

•  The  AJC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  25. 


TABLE  19 


Order  of  Autoregressive  Model  Selected  by  Different  Criteria 
for  the  Response  Times  (R) 

in  a  Closed  Network  of  Queues  with  N  —  2,  Xi  =  1.0,  X2  =  0.5,  p  =  0.75 


Number  of 

Univariate  Autoregressive  Method 

Events 

AIC 

BIC 

h 

Stat.  Test 

500 

1.90 

0.83 

1.27 

0.47 

±.53 

±  .21 

±.25 

±.17 

mm 

0.97 

0.73 

1000 

MSIllW 

±.15 

±  .14 

mm 

1.97 

1.03 

■n 

0.93 

±  .39 

±  .55 

II 

±  .08 

2.30 

1.20 

— 

1.10 

±.43 

±.15 

±  .12 

2.77 

1.50 

| 

1.30 

±  .42 

±.15 

■H 

±.14 

•  Results  are  based  on  30  independent  replications;  the  central  limit  theorem 

for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  Point  estimates  are  obtained  by  estimating  (3*  for  the  variance  reduction 

techniques. 

•  AJC  criterion  was  used  to  select  the  order  for  the  multivariate 

autoregressive  method. 

•  The  maximum  order  for  the  autoregressive  model  is  K  —  25. 
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5.4.  EXAMPLE.  Queue  Length  Process  in  a  Single  Server  Cyclic 


Queue 

We  consider  a  closed  system  consisting  of  two  service  stations  shown  in 
Figure  2.  There  are  a  fixed  number  of  jobs  N  circulating  in  the  system. 
The  departure  process  of  each  queue  is  the  arrival  process  of  the  other.  We 
assume  that  the  service  times  at  both  stations  are  mutually  independent  and 
have  general  distribution  functions. 

For  t  >  0,  let  X(t)  be  the  number  of  jobs  both  waiting  and  being  served 
at  station  A  at  time  f.  The  state  space  E  of  the  process  (X(t)  :  t  >  0} 
is  E  =  {  0, 1, . . . ,  N}.  Then  the  process  (X(f)  :  t  >  0}  is  a  generalized 
semi-Markov  process  (GSMP)  (cf.  Whitt  (1980)).  Let  {( Xn,Cn )  :  n  >  0}  be 
the  embedded  jump  process  for  the  GSMP,  then  we  can  reconstruct  (X(t)  : 
t  >  0}  from  {(Xn,C7„) :  n  >  0}.  First  let 


n— 1 


where  Cm|1-  is  the  value  of  the  *th  clock  reading  at  the  mth  jump  of  X(t). 
Then 

OO 

X(t)~  2lMln.n+i))Xk. 

fc= 0 

The  only  events  that  can  occur  are  a  service  completion  by  A  or  by  B. 
Therefore,  the  clock  vector  c  is  a  pair  recording  the  service  times  left  at 
stations  A  and  B  respectively. 

We  are  interested  in  estimating  the  expected  number  of  jobs  at  station  A 
when  the  state  is  in  equilibrium.  We  have  taken  N  —  3,  and  the  service  times 
to  be  gamma  (2,1)  for  server  A  and  gamma  (3,1)  for  server  B.  We  consider 
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Fig.  2.  Single  server  cyclic  queue. 


400,  S00,  1200,  1600,  2000  cycles,  the  longer  runs  being  continuations  of  the 
shorter  runs.  For  all  our  runs  the'cycles  were  based  on  returns  to  the  state 
0.  We  use  the  discrete  time  method  discussed  in  Section  3.4.  All  runs  were 
replicated  30  times.  The  simulation  results  for  this  model  are  displayed  in 
Tables  20-23. 


TABLE  20 


Simulation  Results  for  Queue  Length  Process  at  Station  A 
in  a  Single  Server  Cyclic  Queue,  3-jobs,  T(2, 1),  T(3, 1)  servers. 


TABLE  21 

Simulation  Results  for  a2  =  4.0771  in  a  Single  Server 
Cyclic  Queue,  3-jobs,  T(2, 1),  r(3, 1)  Servers. 
Point  Estimates  and  90%  Confidence  Intervals 


Number  of  Regenerative 
Cycles  Method 


3.3900  • 

100 

±.3211 


3.9643 

±.2077 


3.9909 

500 

±.1502 


800 


1000 


Univariate  Autoregressive  Method 


A1C 


3.9525 
±  .4073 


±  .3279 


4.2345 
±  .3187 


BIG 


3.8601 
±  .4598 


4.2183 
±  .2059 


4.2534 
±  .1689 


3.8960 
±  .3724 


4.2844 
±  .2443 


4.3404 
±  .2345 


4.2213 

±  .2274 

4.4250 

±.1621 

4.3522 

±  .2131 

4.2460 

±  .1948 

4.4268 

± .1506 

4.3391 

±  .1634 

Stat.  Test 


3.88703 
± .4364 


4.22976 
±  .2989 


±  .3059 


4.3413 
±  .2092 


4.3 
±  -2 


•  Results  are  based  on  30  independent  replications;  the  central  limit 

theorem  for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  The  maximum  order  for  the  autoregressive  model  is  K  —  25. 
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TABLE  22 


Simulation  Results  for  Coverage  Probability  (0.90)  in  a  Single  Server 
Cyclic  Queue,  3-jobs,  T(2, 1),  T(3, 1)  Servers. 

Point  Estimates  and  90%  Confidence  Intervals 


Number  of 

Regenerative 

Univariate  Autoregressive  Method 

Cycles 

Method 

AIC 

BIC 

h 

Stat.  Test 

100 

0.90 

0.93 

0.93 

0.93 

0.93 

±0.09 

±  .08 

±.08 

±  .08 

±.08 

300 

0.93 

0.97 

0.97 

1.00 

0.97 

±0.08 

±  .06 

±.06 

±  .00 

±  .06 

500 

0.97 

MgM 

1.00 

1.00 

±0.06 

1 

±.00 

±  .00 

±.00 

800 

1.00 

1.00 

1.00 

1.00 

1.00 

±0.00 

±  .00 

■SB 

±.00 

±  .06 

1000 

0.90 

0.97 

0.97 

0.97 

0.97 

±0.09 

±.06 

±.06 

±  .06 

±  .06 

•  Results  are  based  on  30  independent  replications;  the  central  limit 

theorem  for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 

•  The  maximum  order  for  the  autoregressive  model  is  K  =  25. 
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TABLE  23 


Order  of  Autoregressive  Model  selected  by  Different  Criteria 
for  Queue  Length  Process  at  Station  A  in  a  Single  Cycle  Queue, 
3-jobs,  T(2, 1),  r(3, 1)  servers. 


Number  of 

Univariate  Autoregressive  Method 

Cycles 

AIC 

BIC 

h 

Stat.  Test 

100 

6.10 

2.27 

3.20 

4.33 

±1.53 

±  .33 

±.58 

±  1.69 

300 

12.57 

2.70 

5.43 

8.53 

±1.78 

±.32 

±.89 

±  1.79 

500 

18.10 

3.70 

7.70 

14.20 

±  1.65 

±.58 

±  1.01 

±  1.75 

800 

1 

11.90 

18.9 

±  1.14 

±1.43 

1000 

23.57 

6.33 

13.47 

21.40 

±.55 

±.81 

±  1.15 

±  1.04 

•  Results  are  based  on  30  independent  replications;  the  central  limit 

theorem  for  i.i.d.  random  variables  was  used  to  form  confidence  intervals. 


•  The  maximum  order  for  the  autoregressive  model  is  K  =  25. 
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CHAPTER  VI 


CONCLUSIONS 

i 

As  the  use  of  computer  simulation  becomes  more  important  in  the  study 
of  complex  phenomena,  the  need  to  develop  theoretically  sound  and  com¬ 
putationally  efficient  methods  for  simulation  output  analysis  becomes  more 
pressing.  The  autoregressive  method  proposed  in  this  paper  uses  techniques 
developed  for  time  series  analysis  to  provide  both  point  and  interval  estimates 
for  parameters  associated  with  the  steady-state  distribution, 
ter  we  shall  examine  the  advantages  and  disadvantages  of  the 

4S 

method. 

The  major  advantage  of  the  autoregressive  method  is  obvious.  It  serves 
as  a  black  box;  users  provide  the  simulation  output  sequence,  the  black  box 
will  produce  results  automatically.  Risers  need  not  devote  time  in  analyzing 
the  system  as  they  must  for  the  regenerative  method,  furthermore,  it  seems 
that  the  autoregressive  method  applies  to  a  much  broader  class  of  stochas¬ 
tic  processes  than  the  regenerative  method  does.  With  the  generalization  to 
multidimensional  processes,  the  method  enables  us  to  apply  variance  reduc¬ 
tion  techniques  to  get  more  accurate  point  estimates  along  with  more  precise 
interval  estimates. 

V  4  ■ 

i  A.  A 

The  disadvantages  of  the  autoregressive  method  are  ake-oterr.  First, 

the  covariance  matrix  obtained  by  the  autoregressive  method  is  just  an  ap- 
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In  this  chap- 
utoregressive 


proximation  for  the  covariance  matrix  present  in  the  central  limit  theorem 

yT 

used  to  construct  confidence  intervals/  Seeemiy  the  assumptions  Tfe  put  on  the 
system  under- sfctrdy  are  stricter  than  we  would  like.^The  most  important  of 
these  are  the  requirement  that  the  process  be  ^-mixing,  the  initial  distribu¬ 
tion  /i  be  absolutely  continuous  with  respect  to  the  stationary  distribution  7r, 
and  the  Radon-Nikodym  derivative  be  bounded  above. 

We  would  like  to  point  out  some  areas  that  present  potential  for  research 
and  development.  First,  try  to  relax  the  assumptions  we  have  made  about 
the  system  under  study.  Second,  to'justiy  as  well  as  find  some  conditions 
which  allow  us  to  apply  the  autoregressive  method  for  processes  other  than 
Markovian  or  Semi-Markovian.  Finally,  to  design  order  selection  criteria 
especially  for  the  multidimensional  autoregressive  method. 
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Let  {  y(n)  :  n  >  0  }  be  a  d-dimensional  strictly,  stationary  process  which 
satisfies  some  regularity  conditions  so  that  the  central  limit  theorem 

i  n-l 

vs(-  E  »(*')  -/*)=*■  mo.  as  n  — *■  oo 
n  »=o 

holds,  where  =$■  denotes  weak  convergence  and  N(0,  £)  is  a  d-dimensional 
normal  vector  with  mean  0  and  covariance  matrix  S.  It  is  demonstrated  that 
E  can  be  approximated  arbitrarily  closely  by  the  spectral  density  function 
/(X)  at  zero  of  an  autoregressive  process.  Based  on  this  approximation, 
we  propose  an  autoregressive  method  to  estimate  the  covariance  matrix  £ 
from  an  observed  sequence.  We  derive  some  recursive  formulae  to  estimate 
the  parameters  for  a  given  order  autoregressive  process  and  discuss  several 
criteria  used  to  select  the  order.  Under  suitable  assumptions  the  consistency 
of  the  estimate  for  E  can  be  established.  The  method  is  applicable  to  the 
simulation  of  some  discrete  or  continuous  time  Markov  chains  and  semi- 
Markov  processes.  Here  the  central  limit  theorem  above  can  be  used  to 
construct  confidence  intervals  for  parameters  associated  with  the  steady-state 
behavior  of  these  processes.  The  method  can  also  be  justified  for  certain  non¬ 
stationary  processes.  Several  variance  reduction  techniques  are  developed 
which  enable  us  to  shorten  the  confidence  intervals  constructed. 
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